Chairman: John F. Alexander

Major  Department:  Urban  and  Regional  Planning 

This  thesis  reports  on  the  design  of  an  "analysis-enabling  framework"  for  more 
productive  use  of  geographic  information  systems  (GIS)  by  planners  and  decision  makers, 
through  the  integration  of  object-oriented  programming  and  database  management  with 
knowledge-based  systems  and  active  database  technology.  This  extends  the  information 
management  system  to  support  its  own  customization  by  those  who  use  it,  for  reflective 
adaptation  of  the  GIS  framework  to  applications  beyond  the  domains  or  scope  anticipated 
by  its  creators. 

The  Smalltalk  programming  environment  is  used,  along  with  an  object-oriented 
database  management  system  (ODBMS),  running  on  a  Unix  computer  platform.  This 
system  works  with  the  Defense  Mapping  Agency's  Vector  Product  Format  (VPF)  digital 
geographic  databases.  The  Smalltalk  application,  called  OVPF,  converts  source  data  from 
a  georelational  database  structure  to  Smalltalk  objects.  Full  spatial  topology  for  point,  line 
and  area  graphics  is  supported  using  the  winged-edge  algorithm,  as  well  as  many-to-many 
relationships  between  geographic  features  and  graphical  primitives. 

OVPF incorporates a quadtree spatial index implemented in Smalltalk. The
quadtree object itself is placed in the ODBMS repository, and serves all queries to the
geo-feature objects. A rule-based framework and event detection mechanism provide a reactive
or  "triggering"  capability  for  enforcing  application-based  data  integrity  and 
interdependency  constraints  on  requests  to  update  geo-features.  This  effectively 
transforms  the  ODBMS  into  an  "active  database." 

OVPF  provides  a  graphical  user  interface  (GUI)  for  direct  interaction  to  view  and 
edit  geo-feature  objects.  Spatial  topology  is  maintained  during  feature  editing,  and  OVPF 
encapsulates  database  operations  within  atomic  transactions.  Additions  and  changes  can 
be  made  to  the  rule  base  at  run-time  that  take  effect  immediately. 

The  importance  and  contribution  of  this  research  is  in  the  use  of  Smalltalk's  unique 
object-oriented  data  modeling  capabilities  for  a  GIS  framework,  in  combination  with  a 
rule-based  active  repository  for  spatial  and  nonspatial  data.  This  approach  supports 
complex  interdependencies  among  geographic  features  in  potentially  very  large  databases. 
It  provides  an  extensible  metadata  framework,  and  the  potential  for  supporting  expert 
system  applications. 

INTRODUCTION


The  management  and  analysis  of  geographic  information  is  becoming  increasingly 
important  to  many  sectors  of  our  industrial  and  technological  society.  Local,  state  and 
federal  governments  use  geographic  data  to  study  and  forecast  population  and 
demographic  growth  patterns,  as  well  as  to  develop  comprehensive  plans  for  urban 
infrastructure  and  development  (Budic  1994).  Utility  companies  use  geographic 
information  to  plan  the  building  and  expansion  of  electrical,  gas,  water  and 
communications  facilities.  Private  industry  and  commercial  businesses  use  geographic 
information  to  study,  plan  and  monitor  their  marketing,  production  and  distribution 
strategies.  The  military  uses  geographic  information  for  strategic  and  tactical  mission 
planning,  mission  rehearsals,  logistics,  navigation,  and  many  other  applications.  The  uses 
and  interdependencies  of  geographic  information  are  growing  rapidly,  as  are  the  sources 
of  this  information  (Maguire  et  al.  1991;  Laurini  and  Thompson  1992). 

A  number  of  approaches  for  geographic  information  systems  (GIS)  have  been 
developed  to  support  access  to,  and  management  of,  such  data.  A  GIS  encompasses,  in 
part,  the  integrated  computer  hardware  and  software  required  to  store,  retrieve  and  update 
both  spatial  and  nonspatial  attributes  associated  with  a  database  of  geographic  features. 
Each  GIS  also  is  a  function  of  the  context  in  which  it  is  used,  and  embodies  a  set  of 
principles  and  procedures  for  the  collection,  analysis,  display  and  plotting  of  geographic 
data. 
All these functions, however, can be grouped into two main categories of
responsibilities  for  a  GIS:  information  management,  and  information  analysis.  The 
substantive  goal  of  GIS  technology  is  to  support  spatial  analysis  (Goodchild  1987)  and 
synthesis  for  the  purpose  of  understanding  and  predicting  patterns  of  behavior  among 
human  and  other  natural  communities  (Wellar  1989;  Wellar  et  al.  1994).  But  the  sheer 
volume  of  data,  and  the  complexity  of  interactions  among  geographic  entities,  has 
necessitated  considerable  effort  to  develop  the  means  for  simply  handling  the  data,  which 
has often seemed to overshadow the efforts for analyzing it (Ding and Fotheringham 1992;
Lober 1995).¹

However,  rapidly  accelerating  advances  in  computer  software  and  hardware  may 
be changing this picture. The evolution of software technologies, specifically
object-oriented (OO) programming, knowledge-based (KB) systems, and active databases,
together  with  the  exponentially  increasing  power  and  storage  capacity  of  affordable 
computers,  has  led  to  the  development  of  analysis-enabling  frameworks  for  more 
productive  use  of  the  information  by  planners  and  decision  makers.  These  enabling 
technologies  could  be  viewed  as  an  extension  of  an  information  management  system,  but 
an  important  distinction  is  that  information  management  systems  to  a  great  extent  are 
designed  to  serve  a  broad  range  of  applications  from  business  to  engineering,  while 
knowledge-based  enabling  technologies  are  designed  around  the  requirements  of  a 
particular  user  community. 

1.  This  division  of  effort  into  information  management  and  analysis  categories  parallels  the 
division  of  effort  in  urban  planning  itself  into  procedural  and  substantive  matters  (see  Faludi 
1973).  While  substantive  issues  generally  seem  the  most  important,  lack  of  attention  to  the 
procedural  issues  can  preclude  substantive  progress.  So  it  is  in  using  information. 

This is a report of an enabling-technology research project in which the core
functionality of an object-oriented geographic information system (OOGIS) was
implemented using the Smalltalk programming language, with a commercial
object-oriented database management system (ODBMS) for the repository. The OOGIS
incorporates a knowledge-base framework which effectively transforms the ODBMS into
an "active database" (these terms will be defined shortly). This is not the first OOGIS
framework to be developed, nor is it the first KBGIS or active database to be developed. It
does, however, seem to be a unique application of OO principles and techniques (thanks to
Smalltalk) to the design of a KB framework for GIS that is simple, "light-weight" and
extensible.

This  development  was  the  result  of  research  funded  by  the  U.  S.  Naval  Research 
Laboratory  (NRL)  and  the  U.  S.  Defense  Mapping  Agency  (DMA),  to  study  alternative 
ways of representing and managing Vector Product Format (VPF) digital geographic
databases. VPF is a specification developed by DMA for a family of database products
(DMA 1993a) that has become part of the DIGEST international standard formats for
representing geographical data (DMA 1994a). VPF is a "georelational" specification
which uses a relational data framework (Date 1995) for storing both spatial and nonspatial
attribute information about the geographic features represented. A number of database
products representing refinements to the VPF standard have been developed, such as the
Digital Nautical Chart (DMA 1993b), Vector Smart Map (DMA 1993c), Urban Vector
Smart Map (DMA 1994b), World Vector Shoreline (DMA 1995), and others. Each of
these  VPF  derivatives  has  different  purposes,  and  thus  different  sets  of  geographic  features 
and  attributes;  in  some  cases  even  different  metadata  (database  schema)  structures. 


The  purpose  behind  the  development  of  VPF  was  two-fold:  (1)  to  provide  a  public 
specification  for  exchange  of  geographic  data  across  computer  platforms  and  GIS 
software  products,  and  (2)  to  support  direct  viewing  capability  without  the  need  for 
proprietary  GIS  software  (DMA  1993a).  In  its  first  purpose,  VPF  occupies  a  similar  role 
to  the  Spatial  Data  Transfer  Standard  (SDTS)  developed  jointly  by  the  U.  S.  Census 
Bureau  and  the  U.  S.  Geological  Survey  (see  National  Institute  of  Standards  and 
Technology  1992;  Lazar  1992;  Fegeas  et  al.  1992;  Davis  et  al.  1992;  Milne  et  al.  1993). 
SDTS  now  defines  the  format  used  for  Census  Bureau  TIGER  data  (Klosterman  and  Lew 
1992). 

VPF  and  its  derivative  specifications  have  evolved  over  several  years,  and  are  just 
now  maturing  to  the  point  that  large-scale  production  and  distribution  of  geographic 
databases on CD-ROM are taking place. One of the problems these database products
present, however, is that the feature data is very difficult to edit due to its inherent
complexity.  That  is,  once  a  VPF  database  has  been  created  (usually  by  exporting  coverage 
data  from  a  commercial  GIS  product),  it  is  very  difficult  to  modify  the  feature  data  while 
maintaining  referential  integrity  of  the  many  linkages  between  features,  attributes, 
graphical  primitives,  and  the  various  indexes  and  join  tables.  The  single  greatest  source  of 
complexity  is  the  attempt  to  represent  the  locational  and  topological  aspects  of  geographic 
features  using  the  relational  database  model.  Commercial  GIS  products  have  typically 
dealt  with  this  representation  through  the  use  of  proprietary,  non-relational  data  structures 
and  techniques.  In  following  standard  rules  for  normalization  (Date  1995),  the  VPF 
specification  has  possibly  reached  the  practical  limits  of  what  can  be  represented  and 
maintained  using  relational  database  technology. 

In its search for ways to address these issues, the Digital Mapping Program
(DMAP)  at  NRL  sought  the  help  of  the  GeoPlan  Center  in  the  Department  of  Urban  and 
Regional  Planning,  and  sponsored  the  current  research  project  to  develop  an  in-house 
(DMA-owned)  viewer/editor  capable  of  displaying  and  modifying  VPF  source  data. 
NRL's  decision  to  sponsor  the  project  here  was  based  on  the  results  of  prior  work  at 
GeoPlan  on  an  OOGIS  product  to  support  automated  mapping  and  facilities  management 
applications  (commonly  abbreviated  AM/FM)  for  large,  regional  electric  utility 
companies.  This  project,  called  Object  GPG  (Alexander  et  al.  1991)  and  later  OFM  (for 
Objective  Facilities  Management),  had  developed  a  core  set  of  object  classes  and  methods 
in  Smalltalk  that  could  be  used  without  license  restrictions  as  a  starting  point  for  NRL's 
VPF  viewer/editor.  I  was  in  a  position  to  lead  development  of  the  VPF  viewer/editor, 
having  worked  on  OFM  for  the  previous  year. 

As  the  DNC  specification  (DMA  1993b)  is  one  of  the  most  complex,  it  was  chosen 
by  NRL  to  be  studied  first.  By  the  end  of  the  first  project  year,  I  had  completed  a  viewer/ 
editor  prototype  in  Smalltalk  that  was  capable  of  displaying  and  editing  DNC  feature  data. 
This  prototype  was  called  ODNC,  with  the  results  appearing  in  a  refereed  conference  the 
following  year  (Arctur  et  al.  1995c).  At  this  stage,  the  project  received  much  more 
attention  and  support  from  DMA.  Our  equipment  was  upgraded  from  aging  workstations 
to Sun SPARCstation 20s. Two more programmers were added to the project at NRL, and
one  to  two  additional  graduate  assistants  worked  on  the  project  at  GeoPlan.  By  the  end  of 
the  second  year  of  the  project,  we  had  integrated  World  Vector  Shoreline,  Vector  Smart 
Map  Level  0,  and  Urban  Vector  Smart  Map  database  definitions  and  feature  data  into  the 
object-oriented framework (now called Object-Vector Product Format, or OVPF), and had
incorporated the use of ObjectStore by Object Design, Inc., as an object-oriented database
repository  for  VPF  source  data.  We  also  had  implemented  full  spatial  topology  (missing 
from  the  first-year  prototype;  see  Chung  et  al.  1995);  a  novel  splay-tree  indexing 
mechanism  to  improve  spatial  query  performance  (Cobb  et  al.  1995b);  and  a  rule-base 
framework  for  enforcing  logical  constraints  on  feature  updates  (Arctur  et  al.  1995d). 

Although  the  OVPF  data  model  and  Smalltalk  application  were  developed  to 
address  specific  needs  of  DMA,  the  techniques  and  lessons  learned  also  seem  well  suited 
to  GIS  applications  for  urban  and  environmental  analysis  and  planning,  as  well  as 
facilities  management  applications  in  the  various  public  utility  industries  such  as 
electrical,  gas,  water  and  communications.  My  main  interest  is  to  show  the  implications 
and  significance  of  this  OO-KB-GIS  framework  for  various  GIS  users.  However,  much  of 
the  thesis  is  necessarily  devoted  to  explaining  the  historical  context  and  the  development 
of  the  framework  itself.  The  remainder  of  this  Introduction  is  in  four  main  parts:  (1)  a 
statement  of  the  specific  technical  goal  and  objectives  for  this  research;  (2)  a  more 
detailed  look  at  the  VPF  data  structures;  (3)  a  review  of  the  technological  context  and  state 
of the art from which the current research proceeds; and (4) a synthesis of the preceding
review  to  show  how  the  various  technologies  intersect  in  this  project. 

Following  the  Introduction,  the  Materials  and  Methods  section  describes  specific 
tasks  which  were  addressed  in  working  toward  the  final  product,  essentially  "recipes"  for 
constructing  the  key  components  of  an  OO-KB-GIS  framework.  As  the  OVPF  program  is 
rather  large  (about  400  classes  and  4500  methods,  with  over  a  megabyte  of  source  code),  it 
is  impractical  to  describe  its  complete  structure  here.  Therefore  I  will  focus  on  the  key 
constituent  frameworks  within  the  overall  design  that  most  directly  contribute  to  the  stated 
objectives. The Results section then provides an illustration of the integrated design in
action,  to  show  how  the  finished  program  meets  each  of  the  stated  objectives.  The 
Discussion  section  concludes  this  work  by  addressing  the  implications  of  the  findings,  as 
well  as  the  limitations  and  directions  for  future  work  with  this  approach. 

Goal  and  Objectives  of  the  Research 

This  work  began  in  response  to  the  perception  that  traditional  GIS  data  modeling 
and  analysis  approaches  are  becoming  inadequate  to  support  the  complex  and 
interdependent  nature  of  geographic  data.  Data  models  once  thought  to  be  fairly  flexible 
now  are  seen  to  impose  significant  constraints  on  the  way  the  data  can  be  used.  This  is 
becoming  an  increasing  problem  as  our  acquisition  of  huge  amounts  of  detailed  data 
accelerates. 

In  a  similar  way,  the  GIS  tools  themselves  can  be  difficult  to  apply  to  a  given 
problem  as  a  result  of  the  often  brittle  way  in  which  they  are  designed.  The  GIS  software 
used  for  most  governmental  and  military  applications  requires  extensive  training  and 
continuous  practice  to  develop  and  maintain  proficiency.  Then,  even  with  proficiency  it 
can  be  difficult,  time  consuming  and  frustrating  to  apply  to  a  given  problem.  It  seems  that 
even one of the most sophisticated GIS software products, Arc/Info by Environmental
Systems Research Institute (ESRI), has fairly rigid data structures that work well for
single-theme  maps,  but  do  not  support  the  kind  of  interdependencies  that  exist,  for 
example,  among  geographic  features  in  facilities  management  applications  for  electrical 
and  other  utilities  industries.  It  seems  inevitable  to  me  that,  with  the  rapidly  changing 
environmental  and  societal  conditions  facing  us,  people  will  continue  to  find  or  create 
perfectly reasonable GIS applications for which existing software systems and tools are
not easily adapted.

The  goal  of  this  research  project  is  to  design  and  demonstrate  ways  of  representing 
and  working  with  the  complexity  of  both  the  geographic  information  and  the  GIS  tools 
that  permit  flexible  adaptation  to  changes  in  requirements  for  the  data  models  over  time. 
To this end, the GIS tools need to meet the following objectives:

•  representing  potentially  detailed  and  complex  interdependencies  among 
geographic  features  in  a  database,  in  a  user-extensible  way; 

•  supporting  very  large  geographic  databases,  potentially  distributed  over  a  network 
of  computers;  and 

•  providing the capability to incorporate expert-system rules and behavior.

Each of these objectives will be discussed briefly below.

Managing  Complex  Interdependencies  Among  Geographic  Features 

Each  application  domain  comes  with  its  own  set  of  rules.  In  planning  an  electrical 
service  extension  for  a  new  urban  subdivision,  the  electric  utility  engineer  has  to  take  into 
account  a  myriad  of  prerequisites  and  corequisites  in  order  to  place  any  one  electrical 
system  facility,  such  as  a  high-voltage  circuit  breaker.  Even  to  plan  a  simple  facility  such 
as  an  overhead  capacitor,  the  engineer  must  first  determine  that  a  power  pole  is  in  place, 
and  that  such  a  pole  is  rated  for  the  capacitor,  and  that  the  capacitor  is  placed  on  the  proper 
circuit,  and  so  on.  Telecommunications  equipment  and  circuits  can  involve  even  more 
complex interconnections than electrical utility services. Thus, in addition to supporting the
drawing of maps of electrical and communications circuits, it is increasingly important
that the GIS tools help the engineer build the maps completely and correctly.
Essentially,  a  means  of  incorporating  a  facility  engineer's  books  of  policies  and  practices 
into  the  GIS  tools  would  be  a  tremendous  aid  for  such  a  user.  An  additional  benefit  the 
GIS  can  provide  is  to  help  manage  the  inventory  and  accounting  of  installed  facilities. 

This  could  also  be  said  for  a  quite  different  application  domain  such  as  land  use 
and  building  codes  enforcement  in  a  city  or  county  government  jurisdiction.  For  example, 
when  a  developer  wishes  to  build  or  modify  a  residential  or  commercial  establishment, 
many  detailed  conditions  must  be  met,  such  as  the  number  of  parking  spaces  required 
based  on  square  footage  of  the  buildings;  setbacks  and  easements  based  on  proximity  to 
roads or power lines; and so on. While these rules are generally mastered reasonably
quickly  by  those  responsible  for  their  enforcement,  this  is  less  true  in  older  and  denser 
urban  areas.  Furthermore,  many  rules  are  open  to  interpretation,  and  are  not  always 
applied  uniformly  but  can  be  applied  strictly  or  loosely  depending  on  an  official's 
preference.  In  addition,  turnover  among  code  enforcement  officers  is  very  high  in  many 
offices,  resulting  in  frequent  retraining  (Heikkila  and  Blewett  1992). 

Most  of  the  existing  software  used  in  map  production  and  GIS  is  not  well  suited 
for  adaptation  to  handling  a  rule  base  for  describing  complex  interdependencies.  Presently, 
there  is  no  way  to  represent  application-domain-specific  dependency  rules  among 
geographic  features  with,  for  example,  AutoCAD  by  AutoDesk,  an  inexpensive  and 
popular  technical  drawing  software  product  which  is  often  used  in  place  of  a  GIS.  With 
sophisticated  GIS  software  such  as  Arc/Info  and  Intergraph,  this  is  still  not  easily  done. 
One  approach  which  works  in  very  limited  situations  is  to  assign  an  "impedance" 
(resistance to flow) between two proximate geographic features, but this does not capture
enough  of  the  semantics  of,  say,  an  electrical  power  network  to  be  useful. 

Supporting  Very  Large  Geographic  Databases 

A  single  geographic  database  can  vary  from  a  few  megabytes  to  several  gigabytes 
and  even  terabytes.  The  volume  of  data  to  be  generated  by  remote-sensing  satellites  will 
reach  a  level  of  terabytes  per  day  in  a  few  more  years.  When  attempting  to  perform  queries 
and  analysis  with  multiple  databases  in  combination,  it  may  not  be  practical  for  all  this 
data  to  reside  on  a  single  host  computer  or  even  on  a  single  local  network.  The  GIS 
framework  needs  to  support  access  to  an  arbitrarily  large  collection  of  data  that  may  be 
distributed  across  an  entire  wide-area  network. 

Supporting  Reactive  Capability  in  the  GIS 

As  the  complexity  of  interdependencies  and  size  of  databases  increase,  it  will 
become  steadily  more  important  to  find  ways  for  the  GIS  to  assist  the  analyst  and  the 
database  administrator  in  maintaining  data  consistency  and  integrity.  This  could  be  in  the 
form  of  data  integrity  constraint  support,  as  well  as  support  for  more  complex  decision 
processes  typical  of  expert  system  applications.  (Several  examples  of  these  will  be 
presented  shortly.) 

To  minimize  the  technical  overhead  of  such  assistance  on  the  user,  it  is  helpful  and 
in  some  cases  necessary  for  the  GIS  to  have  a  flexible,  consistent,  and  automatic  way  of 
reacting  during  attempts  to  access  and  modify  geographic  feature  data.  It  would  also  be 
important  for  the  reactive  capability  to  be  based  on  conditions  anywhere  in  the  complete 
database, and not just on the portions of the database currently loaded into the computer's
active  memory.  This  is  essentially  what  is  meant  here  by  the  term  active  database;  these 
are  designed  in  such  a  way  that  application-defined  events  can  trigger  rule-based  actions 
based  on  conditions  in  any  part  of  the  disk-resident  database  (Widom  and  Ceri  1996, 
Chapter  1). 

For  example,  suppose  a  homeowner  wished  to  get  a  building  permit  to  expand  her 
house  on  her  own  property.  Suppose  also  that  her  property  contained  species  of  plants  that 
cause  her  property  to  be  considered  wetlands,  which  might  very  well  preclude  her  right  to 
build  any  further  on  her  property,  on  legal  grounds.  The  municipal  building  codes 
enforcement  officer  needs  to  be  aware  of  all  such  rules,  as  well  as  the  particulars  with 
respect  to  each  applicant's  property,  to  apply  the  rules  in  an  objective  and  accurate  way. 
Assuming  pertinent,  accurate  data  exists  in  the  GIS  database  on  which  a  correct  decision 
could  be  based,  the  GIS  could  notify  the  codes  official  immediately  of  all  such  pertinent 
conditions  at  the  earliest  opportunity,  thus  precluding  potentially  costly  re-evaluation  at  a 
later  date  due  to  obscure  conditions  that  were  not  noticed  earlier. 
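
The reactive behavior described in this example can be sketched in a few lines. The following is only an illustration of the event-condition-action idea, written in Python rather than Smalltalk; it is not the OVPF rule framework, and the names `Rule`, `ActiveRepository`, `permit_requested` and the parcel fields are hypothetical. An in-memory dictionary stands in for the disk-resident database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One event-condition-action rule (all names here are hypothetical)."""
    event: str                          # application-defined event name
    condition: Callable[[Dict], bool]   # tested against the stored parcel record
    action: Callable[[Dict], str]       # produces a notice for the codes official


@dataclass
class ActiveRepository:
    """Toy stand-in for a repository that reacts to application-defined events."""
    parcels: Dict[str, Dict] = field(default_factory=dict)
    rules: List[Rule] = field(default_factory=list)

    def add_rule(self, rule: Rule) -> None:
        # Rules can be added while the system is running; they apply to later events.
        self.rules.append(rule)

    def signal(self, event: str, parcel_id: str) -> List[str]:
        # Look up the stored record, then fire every rule whose event and
        # condition both match.
        parcel = self.parcels[parcel_id]
        return [r.action(parcel) for r in self.rules
                if r.event == event and r.condition(parcel)]


repo = ActiveRepository(parcels={
    "P-17": {"owner": "Doe", "wetland_species": True},
    "P-18": {"owner": "Roe", "wetland_species": False},
})
repo.add_rule(Rule(
    event="permit_requested",
    condition=lambda p: p["wetland_species"],
    action=lambda p: f"Notify official: parcel owned by {p['owner']} may be wetlands",
))

notices_17 = repo.signal("permit_requested", "P-17")
notices_18 = repo.signal("permit_requested", "P-18")
print(notices_17)
print(notices_18)
```

The point of the sketch is that the official does not have to remember to check for wetlands; the check is attached to the permit-request event itself and runs against whatever is stored for the parcel.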

Zoning  for  land  use  is  becoming  an  increasingly  complex  concern  for  city  and 
county  planners.  Increasing  land  scarcity  and  values  raise  the  costs  and  potential  for 
litigation  as  competition  and  undesirable  interactions  among  different  zones  in  close 
proximity  become  more  common.  Expert  systems  that  can  take  into  account  the  various 
constituents'  preferences  in  anticipation  of  future  problems  could  be  a  very  useful  tool  for 
planners.  Again,  the  GIS  needs  some  form  of  reactive  capability  to  support  this. 

This  concludes  the  discussion  of  goals  and  objectives  for  my  research.  The  next 
section  presents  an  overview  of  the  VPF  database  structure,  which  serves  as  the  data 
source for the proof of concept described in this thesis to address the above objectives.
Following  the  VPF  description  is  my  review  of  the  technological  evolution  which  has  led 
to  the  development  of  the  principal  concepts,  tools  and  techniques  which  are  integrated  in 
this  project. 

Vector  Product  Format  Database  Structure 

For  purposes  of  discussion,  the  Digital  Nautical  Chart  (DNC)  is  considered 
representative  of  the  general  structure  of  all  Vector  Product  Format  (VPF)  products,  and 
serves  throughout  this  thesis  as  the  concrete  example  of  VPF  to  illustrate  the  definitions 
and  linkages  among  features,  attributes,  and  primitive  graphical  elements.  One  of  the  more 
complex  VPF  products,  DNC  was  specifically  designed  to  support  GIS  applications  such 
as  marine  navigation.  As  with  other  VPF  products,  DNC  geographic  data  is  organized  for 
distribution on CD-ROM disks where each disk or disk set contains the database of
geographic information for a particular region. For example (see Figure 1), the
Chesapeake Bay area surrounding Norfolk, Virginia has been coded as database DNC01.²
This database is organized using the hierarchical directory structure shown in Figure 2.
The name of the topmost directory is also the name of the database. The DNC01 directory
contains two files: the Database Header Table (DHT), which provides general information
about the database (source, date of creation, revision level, etc.), and the Library Attribute
Table (LAT), which provides the boundaries of each library in terms of decimal degrees of
latitude and longitude. As defined in the VPF specification and illustrated in Figure 1, a
library defines a geographic boundary and scale, where a larger scale implies a closer-in
view and a smaller scale implies a further-out view. Thus, the Norfolk Approach
(A0108280) library has a smaller scale and presumably lesser accuracy and precision of
data than the Norfolk Harbor (H0108280) library. A given database will have one or
more library directories; Figure 2, for example, shows portions of two of the library
directories, A0108280 and H0108280.

2. Note: Sans serif typeface is used, e.g., DNC01, throughout this thesis to represent actual
filenames or parts of filenames of database components on the CD-ROM, as well as to
represent Smalltalk programming code.


[Figure 1 map labels: A0108170 (Ocean City Approach); H0108280 (Norfolk Harbor); A0108280 (Norfolk Approach); GEN01 (General); COA01 (Coastal)]

Figure 1. Libraries for Chesapeake Bay Area (DNC01) Database

A library subdirectory is further divided into coverages, each of which contains the
data for logically- and spatially-organized groups of geographic features. For example, the
Cultural Landmarks coverage (CUL) includes buildings, power lines, streets, railroads and
other feature classes. The Inland Waterways coverage (IWY) includes features such as
canals, lakes, rivers and dams.

Figure 2. DNC01 Database Directory Structure (partial)
Source:  (Arctur  1995c) 


Within  a  given  coverage  directory,  the  geographic  feature  data  is  divided  into  two 
main  groups  of  files:  those  that  describe  feature  attributes,  and  those  that  describe  feature 
locations.  Those  files  describing  feature  attributes,  for  example  building  type,  road  type, 
accuracy level and so on, are stored in the coverage directory. The files describing feature
locations are stored in tile subdirectories of the coverage directory, where a tile
corresponds  to  a  rectangular  subregion  within  the  library's  boundary.  Tile  size  is  a 
function  of  the  library's  scale.  For  example,  tiles  are  15  minutes  (0.25  degrees)  of  latitude 
or  longitude  on  each  side  for  Harbor  libraries;  30  minutes  (0.5  degrees)  on  each  side  for 
Approach  libraries;  and  3  degrees  on  each  side  for  Coastal  and  General  libraries. 
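
As a small worked illustration of the tiling scheme just described, the sketch below maps a library type to its tile size and finds which tile contains a point. It is written in Python for brevity and is not taken from OVPF. The tile sizes come from the text; the function name `tile_index`, the (column, row) numbering from the library's lower-left corner, and the sample coordinates are assumptions made for this example, since the VPF specification defines its own tile identifiers.

```python
import math

# Tile edge length in degrees for each DNC library type, as stated in the text.
TILE_SIZE_DEG = {
    "Harbor": 0.25,    # 15 minutes
    "Approach": 0.5,   # 30 minutes
    "Coastal": 3.0,
    "General": 3.0,
}


def tile_index(library_type, lon, lat, west, south):
    """Return the (column, row) of the tile containing (lon, lat).

    `west` and `south` give the library's lower-left corner in decimal
    degrees. The numbering scheme is an assumption of this sketch.
    """
    size = TILE_SIZE_DEG[library_type]
    col = math.floor((lon - west) / size)
    row = math.floor((lat - south) / size)
    return col, row


# Hypothetical library origin and a point near Norfolk, Virginia.
harbor_tile = tile_index("Harbor", -76.30, 36.85, west=-77.0, south=36.5)
approach_tile = tile_index("Approach", -76.30, 36.85, west=-77.0, south=36.5)
print(harbor_tile, approach_tile)
```

The same point falls in a different tile for each library type, because the coarser Approach tiles cover twice the extent of the Harbor tiles in each direction.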

As  shown  in  Figure  2,  the  files  describing  feature  attributes  are  further  grouped 
according  to  their  level  of  generality.  The  files  for  Feature  Class  Attributes  (FCA),  Integer 
Value  Description  Table  (INT.VDT),  and  Character  Value  Description  Table  (CHAR.VDT) 
contain  descriptive  information  concerning  all  feature  attributes.  The  Feature  Class 
Schema  (FCS)  file  contains  table-join  relationships  for  many-to-many  relationships  that 
may exist between feature tables, associated notes and other tables (notes tables are
omitted from Figure 2 for simplicity). The feature-specific attribute value detail is stored
in the Building Points Feature Table (BUILDNGP.PFT), Power Lines Feature Table
(POWERL.LFT), and other such feature tables. The most important join tables are the
Entity Node Feature Index Table (END.FIT), Edge Feature Index Table (EDG.FIT), Face
Feature Index Table (FAC.FIT) and Text Feature Index Table (TXT.FIT). The Feature
Index  Tables  (FIT  files)  are  provided  to  relate  each  record  of  the  Feature  Tables  (PFT,  LFT, 
AFT,  TFT  files)  to  their  associated  graphic  primitives  in  one  or  more  of  the  tile 
subdirectories.  Other  join  tables  and  index  files  may  also  be  employed,  as  defined  in  the 
VPF  and  derivative  product  specifications. 
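
The role of the Feature Index Tables as join tables can be sketched as follows. This is a minimal Python illustration, not the OVPF implementation; the row layout (feature_id, tile_id, prim_id) and the function name build_joins are assumptions made for the example and do not reproduce the actual FIT column definitions.

```python
from collections import defaultdict
from typing import NamedTuple

class FitRow(NamedTuple):
    """One hypothetical join-table row linking a feature to a primitive."""
    feature_id: int
    tile_id: int
    prim_id: int

def build_joins(fit_rows):
    """Build lookups in both directions of the many-to-many relationship."""
    prims_of = defaultdict(list)     # feature id -> [(tile id, primitive id)]
    features_of = defaultdict(list)  # (tile id, primitive id) -> [feature id]
    for row in fit_rows:
        key = (row.tile_id, row.prim_id)
        prims_of[row.feature_id].append(key)
        features_of[key].append(row.feature_id)
    return prims_of, features_of

# Feature 1 uses two primitives; primitive 101 is shared with feature 2.
rows = [FitRow(1, 10, 100), FitRow(1, 10, 101), FitRow(2, 10, 101)]
prims_of, features_of = build_joins(rows)
```

Keeping both lookups makes it possible to go from a feature to its primitives for display, and from an edited primitive back to every feature that depends on it.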

A  major  aspect  of  geographic  data  and  the  VPF  specification  that  complicates  its 
relational  structure  is  spatial  topology.  Any  single  geographic  feature  (such  as  a  river) 
might consist of multiple line segments, called graphical primitives. A given line
segment, in turn, might be a part of more than one spatial feature (such as part of the river
and  an  adjacent  property  boundary).  The  topology  (adjacency  and  contiguity)  properties 
of  VPF  features  are  stored  and  managed  at  the  graphical  primitive  level,  within  the  tile 
subdirectories.  VPF  specifies  a  "winged-edge  topology"  model  (see  Figure  3)  to  provide 
"line  network  and  face  topology,  and  also  to  maintain  seamless  coverages  across  a 
physical  partition  of  tiles."  (DMA  1993a,  Appendix  B,  p.  105). 


Figure  3.  Winged-Edge  Topology  Components 
Source:  DMA  1993a,  p.  106. 


In  terms  of  the  database  files,  the  geographic  coordinate  data  is  organized  into 
Entity  Node  (END),  Connected  Node  (CND),  Edge  or  polyline  (EDG),  Face  or  polygon 
(FAC),  and  Text  (TXT)  files.  Entity  Node  records  have  a  foreign  key  to  their  containing 
face  primitive  record;  Connected  Node  records  have  a  foreign  key  to  their  starting  edge 
primitive record; and Edge records have foreign keys to their start node, end node, left
face, right face, left edge and right edge primitive records. Face primitive records include
a foreign key to the Ring (RNG) table, which indicates the starting edge primitive for each
face. The winged-edge topology algorithm (DMA 1993a, Appendix B, pp. 108-111)
describes the procedure by which a face primitive is assembled by tracing its
constituent edge primitives.
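
The idea of assembling a face by following edge-to-edge references can be sketched in Python. This is a simplified illustration under stated assumptions, not the DMA algorithm itself: it assumes that right_edge names the next edge of the ring when the face lies to the right of an edge (edge followed start to end), that left_edge names the next edge when the face lies to the left (edge followed end to start), and that no edge has the same face on both sides. The names Edge and trace_ring are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    id: int
    start_node: str
    end_node: str
    left_face: int
    right_face: int
    left_edge: int    # next edge of the ring bounding left_face
    right_edge: int   # next edge of the ring bounding right_face

def trace_ring(edges, face_id, start_edge_id):
    """Return [(edge id, followed_forward)] for one ring of a face."""
    ring = []
    edge_id = start_edge_id
    while True:
        edge = edges[edge_id]
        if edge.right_face == face_id:
            forward, next_id = True, edge.right_edge
        elif edge.left_face == face_id:
            forward, next_id = False, edge.left_edge
        else:
            raise ValueError(f"edge {edge_id} does not bound face {face_id}")
        ring.append((edge_id, forward))
        if next_id == start_edge_id:
            return ring
        if len(ring) > len(edges):
            raise ValueError("ring does not close; topology is inconsistent")
        edge_id = next_id

# A triangle (face 1) surrounded by the outside face 0.
edges = {
    1: Edge(1, "A", "B", left_face=1, right_face=0, left_edge=3, right_edge=2),
    2: Edge(2, "C", "B", left_face=0, right_face=1, left_edge=3, right_edge=1),
    3: Edge(3, "C", "A", left_face=1, right_face=0, left_edge=2, right_edge=1),
}
```

Because each edge stores both of its neighboring faces, the same three edge records serve to trace the triangle and the surrounding face, which is what allows shared boundaries to be stored only once.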

Text  primitives  consist  of  a  textual  label  and  a  shape  line  that  describes  the 
location  and  path  along  which  the  text  label  is  to  be  displayed.  Text  features  are  not 
topological  structures,  but  simple  cartographic  elements  for  identifying  certain  features  at 
an  arbitrary  location,  such  as  the  name  "Chesapeake  Bay." 

Edge  and  Text  primitives  have  variable-length  records,  as  they  may  consist  of  an 
arbitrary  number  of  locational  points.  To  facilitate  faster  access  to  these  primitives,  VPF 
specifies  additional  index  files,  named  for  Edge  Index  (EDX)  and  Text  Index  (TXX) 
respectively,  to  be  stored  in  each  tile  subdirectory.  The  direct  byte  offset  for  each  record  of 
the  Edge  or  Text  primitive  file  is  stored  in  the  associated  Edge  Index  or  Text  Index  file, 
sorted  by  the  primary  key  in  the  tile-level  Edge  or  Text  primitive  file. 
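
The benefit of a byte-offset index for variable-length records can be shown with a small Python sketch. The binary layout used here (an index of fixed-size pairs holding a 4-byte offset and a 4-byte length, in record order) is an assumption for illustration and is not the actual EDX or TXX file format; the function names are likewise hypothetical.

```python
import io
import struct

INDEX_ENTRY = struct.Struct("<II")  # (byte offset, byte length), hypothetical layout

def write_records(payloads):
    """Write variable-length records and a matching fixed-width index."""
    data, index = io.BytesIO(), io.BytesIO()
    for payload in payloads:
        index.write(INDEX_ENTRY.pack(data.tell(), len(payload)))
        data.write(payload)
    return data.getvalue(), index.getvalue()

def read_record(data_file, index_file, record_id):
    """Fetch record number record_id (1-based) with two seeks and no scanning."""
    index_file.seek((record_id - 1) * INDEX_ENTRY.size)
    offset, length = INDEX_ENTRY.unpack(index_file.read(INDEX_ENTRY.size))
    data_file.seek(offset)
    return data_file.read(length)

data, index = write_records([b"ab", b"cdefg", b"h"])
second = read_record(io.BytesIO(data), io.BytesIO(index), 2)
```

Since every index entry has the same size, the position of an entry can be computed directly from the record's key, so a record is reached without reading any of the variable-length records that precede it.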

See "Representation of Graphical Primitives and Topological Relationships"
starting on page 55 of this thesis for more details on the implementation of topology in
OVPF. See Chung et al. (1995) for issues and techniques concerning maintenance of
topology during geo-feature editing in OVPF.

There  is  still  another  layer  of  data  required  to  represent  VPF  features,  which  is  a 
spatial  index  for  each  coverage.  VPF  specifies  an  adaptive  binary  tree  framework  for 
managing  spatial  indexes  of  point,  edge  and  face  primitives  (DMA  1993a,  Appendix  F). 
Spatial tree cells' keys are stored in additional index files for association with their
contained features for use in spatial queries and display. The OVPF prototype viewer
currently uses a more efficient quadtree spatial data manager (after Samet 1994) instead
of the adaptive binary tree system. See "Design of an Object-Oriented Spatial Index"
starting on page 60 of this thesis for more details on the quadtree implementation in
OVPF.
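
For readers unfamiliar with quadtrees, the following Python sketch shows the general technique of recursively subdividing a rectangle into four quadrants. It is a generic point quadtree written for illustration only and should not be read as the OVPF design, which is implemented in Smalltalk and described later in the thesis. The class name, the capacity parameter and the depth limit are all assumptions.

```python
class QuadTree:
    MAX_DEPTH = 16  # stops endless splitting when many points coincide

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.depth = depth
        self.items = []        # (x, y, payload) tuples held by a leaf
        self.children = None   # four sub-quadrants once the leaf is split

    def insert(self, x, y, payload):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False
        if self.children is None:
            if len(self.items) < self.capacity or self.depth >= self.MAX_DEPTH:
                self.items.append((x, y, payload))
                return True
            self._split()
        # any() stops at the first quadrant that accepts the point.
        return any(child.insert(x, y, payload) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                     (x0, my, mx, y1), (mx, my, x1, y1)]
        self.children = [QuadTree(*q, self.capacity, self.depth + 1)
                         for q in quadrants]
        items, self.items = self.items, []
        for item in items:
            self.insert(*item)

    def query(self, qx0, qy0, qx1, qy1):
        """Return payloads of all points inside the query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []  # no overlap: skip this whole subtree
        found = [p for (x, y, p) in self.items
                 if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for child in self.children or []:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found

tree = QuadTree(0, 0, 100, 100, capacity=2)
for x, y, name in [(10, 10, "a"), (20, 20, "b"), (80, 80, "c"),
                   (15, 12, "d"), (90, 10, "e")]:
    tree.insert(x, y, name)
```

A window query only descends into quadrants that overlap the window, so the cost of redrawing a small part of the map depends on the features near that part rather than on the size of the whole coverage.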

A  useful  aspect  of  the  VPF  specification  is  that  the  relational  files  have  their 
schema  description  within  each  file's  header.  This  facilitates  dynamic  interpretation  and 
processing  of  feature  data,  as  well  as  a  means  of  coping  with  some  of  the  differences  in 
structural  specifications  among  the  various  VPF  products. 
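
The value of a self-describing file can be illustrated with a Python sketch in which the record layout is derived from a header at run time instead of being fixed in the program. The header syntax used here ("NAME=TYPE,COUNT;" with type codes I, F and T) is a deliberately simplified, hypothetical stand-in and is not the VPF header format. Numeric columns are assumed to have a count of 1.

```python
import struct

# Hypothetical type codes mapped to struct format characters.
TYPE_CODES = {"I": "i", "F": "f", "T": "s"}

def parse_header(header):
    """Turn 'ID=I,1;NAME=T,8;' into [(name, type code, count), ...]."""
    columns = []
    for field in header.strip(";").split(";"):
        name, spec = field.split("=")
        type_code, count = spec.split(",")
        columns.append((name, type_code, int(count)))
    return columns

def read_row(columns, raw):
    """Decode one fixed-width record using only the parsed header."""
    fmt = "<" + "".join(f"{count}{TYPE_CODES[t]}" for _, t, count in columns)
    values = struct.unpack(fmt, raw)
    row = {}
    for (name, type_code, _), value in zip(columns, values):
        if type_code == "T":
            value = value.decode("ascii").rstrip("\x00 ")
        row[name] = value
    return row

columns = parse_header("ID=I,1;NAME=T,8;HEIGHT=F,1;")
raw = struct.pack("<i8sf", 7, b"TOWER", 12.5)
row = read_row(columns, raw)
```

The reading code contains no knowledge of any particular table, so a table with different columns only needs a different header, which is the property that helps in coping with structural differences among products.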

To give an idea of the complexity of a single VPF database, one 8-megabyte
library for the Norfolk Harbor database has about 25,000 geographic features (considered
a relatively small data set), and uses over 1500 separate files to describe the location,
topology, and other attributes of these features (a much larger number of
files than one might expect for just eight megabytes of data). Given the high degree of
interdependency  among  features  and  graphical  primitives,  it  is  thus  difficult  to  manage 
even  simple  changes  to  the  location  of  a  spatial  feature,  while  assuring  referential  integrity 
throughout  the  coverage. 

Historical  Technological  Developments 

Many  threads  of  development  have  contributed  to  the  present  research,  which  will 
be  loosely  categorized  according  to  the  period  and  technology  represented.  These  start 
with  pre-GIS  large-scale  urban  models,  object-oriented  programming  systems,  both 
relational and object-oriented database management systems, and knowledge-based
systems (including active database systems). These are followed by discussion of various
approaches  to  GIS,  including  proprietary  and  relational  GIS,  object-oriented  GIS,  and 
knowledge-based  GIS.  The  concluding  section  presents  a  synthesis  of  the  work  in  these 
fields  as  it  pertains  to  the  current  research. 

Large-Scale  Urban  Models 

Due  to  the  need  to  store,  relate,  and  manipulate  large  amounts  of  spatial,  temporal 
and  topical  data,  computers  have  been  used  to  support  geographic  applications  since  the 
1950s  (Budic  1994).  Large-scale  optimization,  econometric,  and  simulation  models  of 
urban  and  regional  systems  were  developed  by  the  1960s,  but  these  began  to  lose  favor  in 
the  U.S.  by  the  early  1970s  (Klosterman  1994).  Lee  voiced  specific  and  influential 
concerns  in  1973  which  are  important  to  keep  in  mind  as  we  look  at  the  various  modeling 
approaches  in  this  review.  He  referred  to  these  as  the  "seven  sins  of  large-scale  models" 
(Klosterman 1994, p. 4): (1) hypercomprehensiveness, or trying to serve too many
purposes at once; (2) grossness, providing information too coarse to be useful;
(3) hungriness, requiring enormous amounts of data (the management of which is error
prone in itself); (4) wrongheadedness, that the models suffered from "substantial and
largely unrecognized deviations between the behavior claimed for them and the variables
and equations that actually determined their behavior" (Klosterman 1994, p. 4);³
(5) complicatedness, that the models' complexity and internally-generated errors resulted
in the need to "massage" the models to produce reasonable-looking output;
(6) mechanicalness, that the models could produce large, unknowable errors due to
iteration and rounding; and (7) expensiveness, that the models' costs were often so high as
to require large federal grants just to put to use. As we will see from experiences with
other approaches, these issues are not unique to urban models.

3. Klosterman writes (1994, p. 4): "As an example, Lee points out that data for an entire
metropolitan area were often used to derive model parameters that were then applied to
specific neighborhoods -- a computerized version of the ecological fallacy."

Object-Oriented  Programming  Systems 

Starting in the late 1960s and continuing into the early 1980s, compact engineering
workstations with the first windowed graphical user interfaces (GUIs) were being
developed at the Xerox Palo Alto Research Center (PARC). A new operating system
called Smalltalk was among those being developed to take advantage of these advanced
processing architectures (Goldberg 1988). Starting with a heritage from the Simula
programming language, Smalltalk was a research project of PARC's Learning Research
Group (later called the Software Concepts Group) to work toward "a vision of the ways
different people might effectively and joyfully use computing power" (Goldberg and
Robson 1983, p. vii). While it has long since surrendered its role as a complete operating
system,⁴ it has retained many features from that legacy⁵ and remains one of the most
powerful and extensible programming languages and software development environments
today. It has also been the principal catalyst in the widespread development of the
object-oriented (OO) paradigm and object-oriented programming systems (OOPS). These
humorous acronyms may have been no accident; early Smalltalk developers tell stories of
trying to work on experimental workstations with an MTBF (mean time between failures)
of about twenty minutes.⁶ However, some of Smalltalk's best features as a software
development environment⁷ emerged in direct response to this rather hostile environment.⁸

4. One of the first Macintosh operating systems was based on Smalltalk, and an early model of
Sun workstation was able to boot up under Smalltalk, but no more. However, one or more
competing versions of Smalltalk now run on UNIX, IBM MVS, AS/400, OS/2, MS DOS,
MS Windows, and Macintosh platforms. Some versions such as ParcPlace VisualWorks
support cross-platform portability; i.e., the same program can be run on any of the supported
platforms without recompiling, regardless of which computer system was used to create it.

A  number  of  very  good  books  are  available  on  programming  with  Smalltalk 
(LaLonde  1994;  Howard  1995;  Smith  1991,  1995;  Lorenz  1995).  While  the  early  years  of 
Smalltalk  were  focussed  on  defining  the  language  and  convincing  the  software  industry  of 
the value of the object-oriented paradigm, attention has shifted since the late 1980s to
refining  the  methods  used  for  analysis  and  design  of  object-oriented  programs.  Numerous 
approaches  have  been  presented;  of  these  I  have  preferred  Booch  and  Rumbaugh's  work 
which  essentially  integrates  a  number  of  distinct  techniques,  each  of  which  is  more  or  less 
suited  to  different  specific  stages  in  the  software  development  life  cycle  (Rumbaugh  et  al. 
1991; Booch 1994; Booch and Rumbaugh 1995). In just the last two years, another focus
of attention has been on the study and use of design patterns in object-oriented
programming (Gamma et al. 1995; Coplien and Schmidt 1995). These are based on the
work of architect and professor Christopher Alexander in his study of design patterns in
urban and natural architecture (Alexander et al. 1977). Design patterns provide a very
concise vocabulary for discussing object-oriented programming design constructs, which
also serves to aid in documenting a program. These are used only slightly in this thesis due
to their recent appearance in the literature. Given time and resources, I would like to
review OVPF again with the goal of identifying the specific design patterns which occur
in the program.

5. Smalltalk helped pioneer the use of lightweight process threads. It also incorporates the use
of semaphores and non-preemptive, priority-based process scheduling. It includes a number
of other advanced programming features as well; see (Goldberg and Robson 1983, 1989;
ParcPlace Systems 1994a, b).

6. Personal communication with Russ Pencin, ParcPlace Systems, 1989.

7. References to "Smalltalk" as a language as well as a programming environment may seem
confusing at first, but I will try to distinguish these different usages by context. In fact,
Smalltalk represents at times a program organization philosophy; a language with rules of
syntax and semantics; and an interactive, graphical, software development environment with
a rich set of tools for developing, cross-referencing, debugging, versioning and documenting
programs. Most of the tools for building Smalltalk programs are also written in Smalltalk,
forming an inherently user-extensible language and development environment.

Finally,  a  very  useful  book  has  recently  appeared  which  is  directed  to  supporting 
"technical  managers  in  organizations  to  be  successful  in  the  use  of  object-oriented 
technology"  (Goldberg  and  Rubin  1995,  p.  v).  This  book  distills  decades  of  experience 
with  Smalltalk  and  other  object-oriented  systems  to  address  the  many  issues  of  effective 
project  management. 

Relational  and  Object-Oriented  Database  Management  Systems 

In the late 1970s and early 1980s, relational database management systems
(RDBMS) came out of the research laboratories and began to find general commercial use
in corporate minicomputer-based⁹ information systems applications such as finance and
accounting. It was also about this time that personal computers (PCs) began to enter the
office. Both of these technologies caught the interest and budgets of planners and other
users of GIS. A fairly thorough guide to concepts and issues of RDBMS may be found in
Date (1995), with additional perspectives provided in Stonebraker (1988).

8. One of the earliest Smalltalk development utilities from ParcPlace Systems was a disk-based
audit trail called the "change list" of all programming code as it was written, along with a
crash recovery tool to roll-forward changes made by the programmer since the last saved
version of the complete program. This soon evolved into a facility to support merging
multiple programmers' code.

An  important  complementary  technology  to  RDBMS  was  the  development  of 
object-oriented database management systems (ODBMS) by the mid- to late 1980s
(see Zdonik and Maier 1990). Initially these were created to meet the needs of complex
applications such as computer-aided drawing, engineering or manufacturing (CAD, CAE
and CAM), which have traditionally not been well served by the relational database model.
Companies  offering  commercial  ODBMS  products  with  Smalltalk  interfaces  include 
GemStone,  Versant,  ObjectStore,  Objectivity,  and  UniSQL.  We  chose  GemStone 
(GemStone  Systems  Inc.  1995)  and  ObjectStore  (Object  Design  Inc.  1995)  for  evaluation 
in  the  current  research  project  because  they  offer  different  client-server  architectures,  and 
we  did  not  have  the  resources  to  examine  more  than  two  of  these.  For  more  information  on 
client-server  issues  among  ODBMS  architectures,  see  (DeWitt  et  al.  1992;  Cobb  et  al. 
1995a). Another interesting paper outlines its authors' proposed "object-oriented database
system  manifesto"  of  issues  that  need  to  be  properly  addressed  when  working  within  the 
object-oriented  paradigm  (Atkinson  et  al.  1992). 

While the entire ODBMS market today is probably smaller than any one of the
major RDBMS companies' customer lists, its influence is being felt. Many corporate
information systems managers are switching to ODBMS for standard business
applications that were traditionally based on RDBMS. And most major RDBMS software
companies now have a strategy in place for supporting ODBMS applications in the
present or near future.

9. "Minicomputers" fill a middle ground between personal workstations and mainframes for
multi-user systems. There are hundreds of models in wide use today by Sun, IBM, HP, DEC,
and many others. Minis and mainframes have become smaller as workstations have become
more powerful, to the point that these distinctions are sometimes hard to make now.

Knowledge-Based  Systems 

Another  field  of  technological  development  which  has  a  bearing  on  my  research 
started  in  the  1960s  with  artificial  intelligence  (AI)  and  expert  systems  (ES).  There  was 
considerable  initial  excitement  over  the  possibility  of  capturing  the  reasoning  and 
heuristics  (rules  of  thumb)  of  experts  in  a  complex  problem  domain  for  use  in  computer 
models.  This  excitement  cooled  by  the  early  1970s  due  to  the  failure  of  the  technology  to 
follow  through  on  its  early  hype  and  promise,  but  the  field  seemed  to  be  saved  from 
demise  by  the  introduction  of  microcomputers.  By  the  early  1980s,  through  the  massive 
dissemination  of  affordable  machines  capable  of  meeting  the  heavy  computational 
requirements  of  AI,  "expert  systems  were  even  appearing  as  part  of  the  most  basic 
educational software packages" (Batty and Yeh 1991, p. 103). Numerous proposals and
case  studies  regarding  the  use  of  expert  systems  in  non-GIS  urban  and  environmental 
applications  have  appeared  in  the  literature  since  the  mid-1980s  (Dickey  et  al.  1986; 
Ortolano  and  Perman  1987;  Davis  et  al.  1987;  Sharpe  et  al.  1991;  Heikkila  and  Blewett 
1992).  Special  issues  of  industry  journals  have  been  devoted  to  expert  systems  in  urban 
and  environmental  planning  and  design  (Sharpe  et  al.  1987;  Batty  and  Yeh  1991). 
Collections of case studies covering a wide range of applications may be found in (Kim et
al. 1990; Wright et al. 1993). Leung (1988) provides a theoretical foundation for the use of
"fuzzy sets" to represent imprecision in spatial analysis and planning.

So what are expert systems? A good way of describing them is as . . .

.  .  .  decision  aids  which  represent  knowledge  about  the  problem  domain  in  terms  of  rule- 
based  structures.  As  such,  they  are  models  of  the  problem-solving  process  which  enable 
conditional  syllogisms  in  the  IF-THEN  form  to  be  executed  in  sequence.  In  fact,  the 
problem  domain  is  usually  represented  by  a  network  of  such  rules  and  the  expert  system 
processes  these  rules  by  searching  this  network  to  find  the  ultimate  conclusions  or  the 
original premises which represent the basic outputs and inputs which drive the system.
These  systems  are  organised  into  a  ...  knowledge  base  which  contains  the  data  in  a  form 
which  can  be  operated  upon  by  the  system's  inference  engine  which  contains  the  search 
procedures.  Searching  is  usually  accomplished  by  forward  chaining  from  premise  to 
conclusion  or  backward  chaining  from  conclusion  to  premise.  (Batty  and  Yeh  1991, 
p.  103) 

In  the  design  of  expert  systems  frameworks,  technology  has  branched  in  three 
main directions: (1) pure production-rule systems such as OPS5 (Brownston et al. 1985);
(2) first-order logic systems such as Prolog and its derivatives (Torsun 1995); and
(3) active databases (Chakravarthy 1992; Jaeger and Freytag 1995; Widom and Ceri
1996).  Each  of  these  will  be  described  briefly,  even  though  the  first  two  were  found  to  be 
unsuitable  for  the  problem  domain  in  this  research  project.  Lessons  are  gained  from  all 
three  approaches. 

Production-rule  systems 

The basic architecture of production-rule systems, sometimes called simply
"production systems," consists of three main components: (1) a data store or working
memory,  containing  a  global  database  of  symbols  representing  facts  and  assertions  about 
the  problem;  (2)  a  set  of  rules,  which  constitutes  the  program,  stored  in  production 
memory  or  rule  memory;  and  (3)  an  inference  engine  to  execute  the  rules.  Rules  have  two 
parts:  a  condition  to  be  tested,  and  an  action  to  execute  if  the  condition  proves  to  be  true 
(Brownston  et  al.  1985,  pp.  6-7).  Both  forward  chaining  and  backward  chaining  are 
supported.  Production  systems  proceed  computationally  by  examining  and  matching  the 
states of all the data against all the rule conditions in each program cycle. This is well
suited to applications in which the program must respond adaptively to frequent,
unpredictable  changes  in  its  environment.  Unfortunately  for  our  case,  there  is  no  means  of 
supporting  interactions  or  sequencing  among  rules.  This  would  not  work  well  with  data 
models  of  interdependent  geographic  features  in  which  the  order  of  rule  processing  is 
often  important,  such  as  in  facilities  management  applications  for  utilities.  Also,  the 
considerable  overhead  involved  in  examining  the  data  store  and  the  rule  base  in  each 
programming  cycle  is  an  unnecessary  price  to  pay  "when  efficient  and  provably  correct 
algorithms  or  even  close  approximation  algorithms  exist  for  a  task.  ...  In  general,  if  the 
problem  and  the  solution  to  the  problem  are  well  structured  or  highly  structured,  it  is 
unlikely  that  the  best  computer  representation  to  the  problem  will  be  a  production-system 
program."  (Brownston  et  al.  1985,  p.  26)  In  the  current  project,  our  problems  and  solutions 
tend  to  be  highly  structured. 
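
The match-and-fire cycle described above, and the overhead it implies, can be seen in a deliberately naive Python sketch. It is an illustration only and is not how OPS5 is implemented; the function name and the representation of rules as (condition, action) pairs over a set of facts are assumptions.

```python
def run_production_system(facts, rules, max_cycles=100):
    """Repeatedly test every rule against working memory until nothing new fires.

    facts: initial working memory, a set of strings.
    rules: list of (condition, action) pairs; condition(facts) returns a bool,
           action(facts) returns a set of facts to assert.
    """
    facts = set(facts)
    for _ in range(max_cycles):
        fired = False
        for condition, action in rules:   # every rule is examined in every cycle
            if condition(facts):
                new_facts = action(facts) - facts
                if new_facts:
                    facts |= new_facts
                    fired = True
        if not fired:
            break
    return facts

# Hypothetical utility-network rules, listed in the "wrong" order on purpose.
rules = [
    (lambda f: "valve_closed" in f, lambda f: {"service_interrupted"}),
    (lambda f: "pipe_broken" in f, lambda f: {"valve_closed"}),
]
result = run_production_system({"pipe_broken"}, rules)
```

The rules reach the expected conclusion regardless of their order in the list, but only because the whole rule base is re-examined on each cycle; the program has no way to state that one rule should run before another.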

First-order logic systems

First-order  (also  called  predicate)  logic  programming  approaches  such  as  Prolog 
introduce  formal  semantics  and  provable  correctness  of  theorems  as  the  means  of  solving 
problems.  These  are  generally  backward-chaining  systems,  in  which  the  system  seeks  to 
determine  the  premise  to  a  given  conclusion  through  exhaustive  proofs  of  applicable 
theorems  which  could  apply  to  the  resolution  of  a  given  inference  rule.  As  with  production 
systems,  these  have  significant  shortcomings  for  dealing  with  very  large  databases.  As 
Torsun  writes  (1995,  p.  455):  "the  use  of  logic  programming  routinely  in  industrial/ 
commercial  applications  is  severely  hindered  by  a  serious  drawback.  This  drawback  is  the 
inefficiency  of  logic  languages  in  applications  where  the  problem  is  complex,  large,  or 
both. . . . logic programming is domain independent and the search methods are
undirected, but for efficiency to be achieved, proof servers need to be more focused."
Another  serious  shortcoming  of  logic  programming  is  that  these  languages  do  not  include 
tools  for  building  sophisticated  windowed  GUIs  capable  of  managing  tens  of  thousands  of 
points  and  vectors  at  once,  so  would  have  to  be  somehow  integrated  with  a  GUI  toolkit  for 
this  functionality. 

An application area where predicate logic appears to be well suited is that of
programming code generation.¹⁰ This is an important area offering increased productivity
of  programming  effort  in  certain  application  domains.  In  this  case,  the  problem  domain  is 
limited  to  the  syntax  and  semantics  of  the  input  scripting  language  and  of  the  generated 
output  code,  as  far  as  the  logic  system  is  concerned.  However,  in  facilities  management, 
land-use  zoning  decision  making,  and  many  other  applications  of  GIS,  the  bounds  and 
semantics  of  the  problem  domain  are  too  broad  and  complex  to  fit  the  limitations  of  logic 
systems. 

Active  databases 

The field of active databases has emerged since the mid- to late 1980s as a very
promising  technology.  "Active  database  systems  are  able  to  recognize  specific  situations 
(in  the  database  and  beyond)  and  to  react  to  them  without  direct  explicit  user  or 
application  requests."  (Gatziu  and  Dittrich  1992,  p.  23)  This  represents  in  some  ways  an 
extension  of  the  traditional  "passive"  database  management  system,  and  in  some  ways  an 
extension of the OPS5 production-rule system. Active databases are superior to passive
databases for enforcing general integrity constraints and enabling triggers, as well as for
supporting data-intensive expert systems and workflow management applications, since
the rule base does not have to fit completely in memory (Widom and Ceri 1996).

10. Personal communication with Dr. Sharma Chakravarthy.

Common  to  most  active  database  systems  are  the  notions  of  events  (or  situations) 
and  actions,  associated  via  rules,  as  in  SAMOS  (Gatziu  and  Dittrich  1992).  This  is  often 
referred  to  as  an  "ER"  (for  event-rule)  framework.  An  event  might  be  the  creation, 
modification  or  removal  of  (in  our  case)  a  geographic  feature  object  or  a  graphical 
primitive  object.  A  rule  might  associate  the  removal  of  an  object  with  an  action  to  check 
the  user's  authorization  privileges  before  allowing  the  event  to  proceed.  Another  rule 
might  associate  the  creation  of  a  new  feature  object  with  an  action  to  check  and  enforce 
the  data  integrity  constraints  for  that  feature's  location  or  attributes. 

An  extension  of  this  approach  developed  for  the  HiPAC  project  (Chakravarthy 
et  al.  1989;  Dayal  et  al.  1996)  and  used  in  Snoop  (Chakravarthy  and  Mishra  1993)  adds 
the  notion  of  being  able  to  check  arbitrary  conditions,  potentially  having  to  do  with  objects 
not  related  to  the  triggering  event,  before  firing  the  associated  rule's  action.  This  is  often 
referred to as an "ECA" (for event-condition-action) framework. For example, with this
approach we might condition the insertion of a new bridge feature object on the
prior existence of a nearby road feature with which it can establish application-dependent
associations. Rule-encoded interdependencies of this kind among geographic features
would be very useful in facilities management and other complex GIS applications.
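
The bridge-and-road example can be written out as a small event-condition-action sketch in Python. It is a minimal illustration under stated assumptions, not the HiPAC, Snoop or OVPF mechanism: the class ActiveStore, the feature dictionaries with "kind" and "xy" keys, and the distance threshold are all hypothetical.

```python
import math

class IntegrityError(Exception):
    """Raised when a rule's action vetoes an update."""

class ActiveStore:
    def __init__(self):
        self.features = []
        self.rules = []  # (event name, condition, action) triples

    def add_rule(self, event, condition, action):
        self.rules.append((event, condition, action))

    def insert(self, feature):
        # Event: an insert request. Each matching rule's condition may look
        # at any object in the store, not only the one being inserted.
        for event, condition, action in self.rules:
            if event == "insert" and condition(self, feature):
                action(self, feature)
        self.features.append(feature)

def bridge_lacks_road(store, feature, max_dist=5.0):
    """Condition: a bridge is being inserted and no road lies within max_dist."""
    if feature["kind"] != "bridge":
        return False
    return not any(f["kind"] == "road"
                   and math.dist(f["xy"], feature["xy"]) <= max_dist
                   for f in store.features)

def reject(store, feature):
    """Action: refuse the update."""
    raise IntegrityError(f"no road near bridge at {feature['xy']}")

store = ActiveStore()
store.add_rule("insert", bridge_lacks_road, reject)
store.insert({"kind": "road", "xy": (0.0, 0.0)})
store.insert({"kind": "bridge", "xy": (3.0, 4.0)})  # accepted: a road is nearby
```

The constraint lives in the store rather than in each application that edits features, so every insert request is checked in the same way no matter which tool issues it.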

Active  databases  can  be  based  on  either  relational  or  object-oriented  database 
models,  and  depending  on  their  design,  can  support  forward  chaining,  backward  chaining, 
or  both.  Most  of  the  earlier  research  and  commercial  database  products  applied  reactive 
capability to RDBMS (Chakravarthy et al. 1989; Stonebraker et al. 1988; Widom and
Finkelstein 1990; Darnovsky and Bowman 1990; InterBase 1990). More recently, others
have  attempted  to  incorporate  event  and  rule  support  into  an  ODBMS  (Gehani  et  al.  1992; 
Gehani  and  Jagadish  1991;  Diaz  et  al.  1991;  Chakravarthy  et  al.  1993;  Medeiros  and 
Pfeffer  1990;  Su  et  al.  1989;  Anwar  1992). 

Anwar  et  al.  (1993)  examine  the  implications  of  the  shift  from  a  relational  to  an 
object-oriented  DBMS,  and  point  out  the  greater  flexibility  from  using  an  ODBMS  for  the 
active  database.  In  particular,  it  is  noted:  "In  contrast  to  a  fixed  number  of  pre-defined 
primitive  events  in  the  relational  model,  every  method/message  is  a  potential  event" 
(p.  99).  This  is  an  important  distinction.  For  example,  a  trigger  in  a  typical  RDBMS  might 
be  set  up  to  take  effect  on  update  of  a  record  in  a  given  table,  but  it  is  not  possible  to 
trigger only on update of a certain field in the table; the trigger will always take effect for
an update to the record no matter which field was the one updated.¹¹ In an ODBMS, it is
possible  to  have  triggers  defined  at  any  granularity  of  an  object's  structure.  Another 
finding  from  Anwar  et  al.  (1993)  is  that  by  appropriate  specification,  parameterized  rules 
can  be  associated  with  either  a  class  object  (in  which  case  the  rule  would  be  in  effect  for 
all  instances  of  the  class)  or  for  an  individual  instance. 

The  Snoop  model  introduced  the  notion  of  complex  events,  which  could  be 
defined  either  as  a  sequence  of  specific  primitive  events,  or  as  a  Boolean  composite  of 
multiple  primitive  events.  Taken  together,  these  form  a  surprisingly  simple  and  powerful 
set  of  constructs,  which  are  the  basis  of  the  event  system  now  in  our  OVPF  application. 
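
Sequence and Boolean composites of primitive events can be sketched with a few small detector classes. This Python code is an illustration of the general idea only; it does not reproduce Snoop's operators, event parameters or detection semantics, and the class names and the single-method feed protocol are assumptions.

```python
class Primitive:
    """Detects one named primitive event."""
    def __init__(self, name):
        self.name = name
    def feed(self, name):
        return name == self.name

class Or:
    """Fires when either sub-event fires."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def feed(self, name):
        fired_a = self.a.feed(name)
        fired_b = self.b.feed(name)
        return fired_a or fired_b

class And:
    """Fires once both sub-events have fired, in any order."""
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.seen_a = self.seen_b = False
    def feed(self, name):
        self.seen_a = self.a.feed(name) or self.seen_a
        self.seen_b = self.b.feed(name) or self.seen_b
        if self.seen_a and self.seen_b:
            self.seen_a = self.seen_b = False  # reset for the next occurrence
            return True
        return False

class Seq:
    """Fires when the second sub-event fires after the first has fired."""
    def __init__(self, first, second):
        self.first, self.second = first, second
        self.seen_first = False
    def feed(self, name):
        if self.seen_first and self.second.feed(name):
            self.seen_first = False
            return True
        if self.first.feed(name):
            self.seen_first = True
        return False

# Hypothetical composite: an edge is deleted and then a face is updated.
detector = Seq(Primitive("delete_edge"), Primitive("update_face"))
history = [detector.feed(e)
           for e in ["update_face", "delete_edge", "update_face"]]
```

Because composites accept other composites as operands, a rule can be attached to an arbitrarily nested expression while the rule mechanism itself only ever sees a single "fired" signal.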


11. One exception is Sybase. This RDBMS is capable of limiting an update trigger to fire
only on change to a specified field.

The point of building active functionality like this into the database system itself is
to  ensure  consistent  usage  and  high  performance.  SAMOS  (Gatziu  and  Dittrich  1992)  is 
an  example  of  an  active  database  layer  implemented  on  top  of  ObjectStore,  the  same 
ODBMS  we  are  using  for  our  OOGIS  repository.  It  was  found  in  the  SAMOS  project  that 
some  performance  is  inevitably  lost  when  the  active  capability  is  added  on  top  of  the 
ODBMS  rather  than  being  built  into  the  kernel  from  the  beginning.  The  OVPF  application 
will  share  this  fate,  but  for  the  present  we  are  only  concerned  with  prototyping  advanced 
capabilities,  and  found  this  an  acceptable  trade-off,  given  there  are  no  commercially 
available  ODBMSs  with  reactive  capability. 

Geographical  Information  Systems 

There are now numerous textbooks and references on GIS. Two of the more
comprehensive books with which I am familiar are Maguire et al. (1991) and Laurini and
Thompson (1992). These discuss the key issues and current approaches for creating and
using geographic databases. Garson and Biggs (1992) also appears to be a useful
introductory guide to GIS.

Briefly,  a  GIS  provides  (in  varying  levels  of  quality  and  ease  of  use,  according  to 
the system's manufacturer): (1) a database of graphical, locational information for a set of
geographic  features;  (2)  a  synchronized  database  of  nonspatial  attributes  for  the  same  set 
of  geographic  features;  (3)  a  graphical  user  interface  (GUI)  with  query  and  update 
capabilities  allowing  a  user  to  access  and  modify  the  feature  data;  and  (4)  analytical 
capabilities  allowing  the  user  to  conduct  studies  taking  advantage  of  the  geometrical  or 
topological properties of the geographic data. Some examples of spatial analysis
supported by GIS that were previously not feasible include "estimating runoff volume in
specific areas, locating areas with scenic amenity, and searching for paths through
three-dimensional space that satisfy certain conditions, such as minimizing distance or
construction costs or avoiding major obstacles" (Han and Kim 1989, p. 298). A number of
publications  have  appeared  which  enumerate  the  functionality  which  should  or  could  be 
found  in  a  GIS  (for  example,  see  Goodchild  1988,  1994;  Tomlin  1990).  In  our  OVPF 
project,  we  have  not  yet  reached  the  stage  of  implementing  analytical  functions  such  as 
these;  hence  we  usually  refer  to  OVPF  simply  as  a  viewer/editor. 

The data models used, and much of the analysis performed, with vector-based GISs
depend on graph theory (Harary 1969) and planar topology (Alexandrov 1957, 1965;
Munkres 1966; Spanier 1966; Simmons 1963). For the reader interested in probing the
mathematics of topology in more depth (according to one well-informed source¹²), "the
standard reference is Alexandrov (1965). More mathematical, and a fine book, is Munkres
(1966). THE reference for the insider is Spanier (1966)." However, it is unnecessary for
our  purposes  to  explore  the  range  of  methods  for  representing  topology,  as  the  VPF 
specification  defines  a  particular  manner  in  which  primitive  spatial  objects  shall  be 
represented  and  associated  with  each  other.  The  VPF  "winged-edge  topology"  model 
(DMA  1993a,  Appendix  B)  is  a  form  of  the  point-line-polygon  model  typical  in  existing 
GIS  systems  (Worboys  1994).  Our  OVPF  application  provides  complete  support  for  the 
VPF  winged-edge  topology  model,  as  will  be  described  in  the  Materials  and  Methods 
section. 
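
To make the winged-edge idea concrete, the following is a minimal sketch in Python (not the Smalltalk used in OVPF). It assumes a simplified convention: each edge records the faces on its left and right, plus the next edge around each of those faces. The field names and the handling of direction are assumptions for illustration. The exact semantics of the VPF edge primitive should be taken from the VPF specification, and dangling edges (same face on both sides) are not handled.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    id: int
    start_node: str
    end_node: str
    left_face: int
    right_face: int
    left_edge: int    # assumed: next edge around left_face
    right_edge: int   # assumed: next edge around right_face


def face_ring(edges, face, first_edge_id):
    """Return the ids of the edges bounding `face`, starting at first_edge_id."""
    ring = []
    edge_id = first_edge_id
    while True:
        edge = edges[edge_id]
        ring.append(edge_id)
        if edge.right_face == face:
            edge_id = edge.right_edge
        elif edge.left_face == face:
            edge_id = edge.left_edge
        else:
            raise ValueError(f"edge {edge.id} does not border face {face}")
        if edge_id == first_edge_id:
            return ring
        if len(ring) > len(edges):
            raise ValueError("ring does not close; topology is inconsistent")


# A single triangle A-B-C: face 2 is the interior, face 1 the surrounding area.
triangle = {
    1: Edge(1, "A", "B", left_face=2, right_face=1, left_edge=3, right_edge=2),
    2: Edge(2, "B", "C", left_face=2, right_face=1, left_edge=1, right_edge=3),
    3: Edge(3, "C", "A", left_face=2, right_face=1, left_edge=2, right_edge=1),
}
interior_ring = face_ring(triangle, face=2, first_edge_id=1)
exterior_ring = face_ring(triangle, face=1, first_edge_id=1)
```

The point of the structure is that the boundary of any face can be recovered by following stored references alone, without any geometric computation on the coordinates.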


12. Personal communication with Dr. Max Egenhofer.

Another important aspect of a GIS is the choice of spatial indexing algorithm.
While VPF specified a particular spatial index approach (the adaptive binary tree), we felt
less bound to follow this guideline, for a couple of reasons: (1) the adaptive binary tree could
not  perform  as  well  as  many  other  structures;  and  (2)  the  choice  of  spatial  index  is  critical 
to  the  overall  performance  of  the  system  for  queries  and  analysis.  The  version  of  the 
Objective  Facilities  Management  (OFM)  program  from  which  OVPF  evolved,  used  a 
simple  quadtree  approach  based  on  (Samet  1994).  Other  popular  approaches  include  range 
trees  (e.g.,  R  tree,  R+  tree,  and  R*  tree;  see  also  Samet  1994;  Beckmann  et  al.  1990; 
Brinkhoff et al. 1993). We decided on the quadtree partly because: (1) the quadtree
approach  yields  unique  spatial  index  keys,  whereas  the  range  tree  approach  does  not  (this 
would  be  important  for  storing  spatial  index  keys  on  disk  for  later  use);  and  (2)  it  was 
already  implemented  in  OFM,  and  had  reasonable  performance  with  our  prototype.  One 
problem  of  range  trees  occurs  in  the  case  of  overlapping  regions  for  a  given  spatial  object: 
it  is  not  possible  to  determine  a  unique  and  repeatable  index  key  for  the  spatial  object.  The 
VPF specification, however, provided for storing the spatial index key values as part of the
georelational  file  structure,  for  the  purpose  of  allowing  faster  access  to  features.  The  use 
of  range  trees  would  preclude  our  ability  to  store  a  repeatable  spatial  index  key  with  a 
given  feature  object  in  the  VPF  file  structure. 
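
The repeatability argument can be illustrated with a minimal sketch of one common way to derive a quadtree key: descend from the full extent and record the quadrant at each level until the feature's bounding box straddles a dividing line. This is written in Python for illustration and is not the OFM or OVPF implementation. The function name, the digit assignment (0 = SW, 1 = SE, 2 = NW, 3 = NE) and the depth limit are assumptions.

```python
def quadtree_key(bbox, extent, max_depth=16):
    """Key of the smallest quadrant of `extent` that fully contains `bbox`.

    Both arguments are (xmin, ymin, xmax, ymax) tuples. The result depends
    only on the two rectangles, so it is the same on every run and does not
    depend on the order in which features were inserted.
    """
    xmin, ymin, xmax, ymax = extent
    bx0, by0, bx1, by1 = bbox
    digits = []
    for _ in range(max_depth):
        midx = (xmin + xmax) / 2.0
        midy = (ymin + ymax) / 2.0
        if bx1 <= midx:
            east = 0
        elif bx0 >= midx:
            east = 1
        else:
            break                      # box straddles the vertical split
        if by1 <= midy:
            north = 0
        elif by0 >= midy:
            north = 1
        else:
            break                      # box straddles the horizontal split
        digits.append(str(2 * north + east))
        # Shrink the working extent to the chosen quadrant.
        if east:
            xmin = midx
        else:
            xmax = midx
        if north:
            ymin = midy
        else:
            ymax = midy
    return "".join(digits)


world = (0.0, 0.0, 100.0, 100.0)
key_a = quadtree_key((10, 10, 20, 20), world)
key_b = quadtree_key((60, 60, 70, 70), world)
key_c = quadtree_key((40, 40, 60, 60), world)   # straddles the root split
```

A key computed this way could be written alongside a feature record and recomputed later with the same result, which is the property the paragraph above relies on.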

Object-Oriented  GIS 

Since  1987,  numerous  object-oriented  approaches  and  data  models  for  GIS  have 
been  proposed  and  examined  in  the  research  literature  (Egenhofer  and  Frank  1987; 
Dueker and Kjerne 1987; Abel 1989; Egenhofer and Frank 1992; Herring 1992; Worboys
1994). It is noted in Egenhofer and Frank (1992, p. 16) that "Object-oriented
programming  languages  will  be  needed  to  implement  the  future  GIS  most  efficiently.  . .  . 
because  it  naturally  supports  the  treatment  of  complex,  in  this  case  geometric,  objects 
(Kjerne  and  Dueker  1990).  Compared  with  conventional  data  models,  an  object-oriented 
design  is  more  flexible  and  better-suited  to  describe  complex  data  structures."  With  regard 
to  ODBMS,  the  same  authors  continue  (1989,  p.  16),  "By  using  a  database  management 
system,  data  are  treated  by  their  properties;  the  object-oriented  approach  groups  these 
properties  into  possibly  complex  objects  and  corresponding  operations." 

Two  recent  doctoral  dissertations  have  been  directed  to  the  potential  for  using 
object-oriented  concepts  in  GIS  (Feuchtwanger  1993;  Karnes  1995).  Feuchtwanger 
proposes  a  geographic  semantic  database  model,  incorporating  notions  of  both  structural 
and  behavioral  aspects  of  stored  information.  Karnes  implements  a  Smalltalk-based 
prototype  for  modeling  land  parcel  networks  in  a  cadastral  cartography  application. 
Karnes'  work  is  especially  interesting,  as  he  explores  the  use  of  object-oriented 
programming  as  a  means  of  modeling  and  creating  novel  metaphors  for  real-world 
representations  in  cartographic  and  geographic  domains.  It  is  the  flexibility  of  object- 
oriented  technology  (and  Smalltalk's  development  environment)  in  supporting  complex 
representations  that  facilitates  this  application. 

Commercial  OOGIS  products  have  emerged  in  the  last  few  years,  including 
Arcview/Avenue from Environmental Systems Research Institute (ESRI 1995a, 1995b),
Magik  from  Smallworld  Systems  (Smallworld  1995),  Gothic  Application  Development 
Environment  (Gothic  ADE)  from  Laser-Scan  (LSL  1995a,  1995b),  and  of  course 
Objective  Facilities  Management  (OFM  1996). 


Avenue  is  an  object-oriented  scripting  language  for  supporting  Arcview 
applications.  It  appears  to  draw  much  of  its  inspiration  from  both  Smalltalk  and  C++,  in 
terms of syntax and semantics. However, it has some serious shortcomings for use in GIS:
(1)  it  is  a  closed  system,  that  is,  the  user  may  not  create  new  classes  or  class  hierarchies, 
but  can  only  use  the  classes  provided  with  Avenue;  and  (2)  because  Arcview  is  not 
designed  or  intended  to  be  used  to  edit  Arc/Info  coverage  data,  Avenue  cannot  support 
editing  of  Arc/Info  coverages  either. 

Smallworld  Magik  is  more  powerful  in  some  ways  than  Arcview  with  Avenue, 
providing  a  full-featured  object-oriented  language  with  much  of  the  semantics  of 
Smalltalk.  The  user  can  create  classes  and  hierarchies  of  geographic  features,  and  can 
conduct  many  useful  analytical  operations  with  the  system.  Smallworld  provides  a 
proprietary  relational  database  system  for  both  spatial  and  nonspatial  data,  as  well  as 
supporting  access  to  Oracle  and  other  commercial  RDBMS  repositories.  Smallworld  has 
so  far  focussed  on  facilities  management  applications  for  electrical,  gas,  water  and 
telecommunications  industries. 

Laser-Scan  Gothic  ADE  is  also  a  powerful  object-oriented  GIS,  providing  both 
scripting  language  capability  and  a  proprietary  object-oriented  database  system  capable  of 
holding  spatial  and  nonspatial  geographic  data  on  the  order  of  terabytes  in  size.  Gothic 
ADE  uses  an  interesting  combination  of  C-language  libraries  and  a  high-level  scripting 
language  called  Lull,  to  achieve  what  they  claim  is  higher  performance  of  processing  than 
is  possible  with  Smallworld's  Magik  system.  Laser-Scan  has  so  far  concentrated  on  the 
market for large-scale map production systems.

Expert Systems with GIS

There  has  been  considerable  progress  incorporating  geographic  data  into  urban 
and  regional  zoning  and  other  policy  formation  efforts  (Maguire  et  al.  1991;  Budic  1994). 
Research  and  practice  with  artificial  intelligence  (AI)  and  expert  systems  (ES)  in  recent 
years  has  resulted  in  the  proposal  and  development  of  several  models  for  supporting  urban 
and  regional  policy  studies  and  implementation  (see  Dickey  et  al.  1986;  Ortolano  and 
Perman  1987;  Davis  et  al.  1987;  Batty  and  Yeh  1991;  Sharpe  et  al.  1991;  Yan  et  al.  1991). 
However,  in  none  of  these  cases  was  GIS  data  directly  incorporated  into  the  design  or  use 
of  an  expert  system.  Furthermore,  some  of  the  experience  papers  draw  attention  to 
significant  difficulties  and  shortcomings  in  applying  AI  or  ES  technology  to  urban 
planning applications (Dickey et al. 1986; Sharpe et al. 1991). Another thoughtful paper
discusses  numerous  legal  and  ethical  issues  regarding  the  use  of  ES  in  planning  (Wigan 
1987). 

Other  research  has  focussed  on  applying  AI,  ES  and  DSS  approaches  to  work 
specifically  with  GIS  as  an  enabling  technology  for  spatial  queries  and  information 
analysis (Peuquet 1987; Taylor 1991; Han et al. 1991; Webster et al. 1991; Worboys 1994;
Chen  et  al.  1994;  as  well  as  several  of  the  papers  from  Kim  et  al.  1990;  Wright  et  al.  1993). 

A  very  interesting  work  from  the  mid-1980s  was  KBGIS-II  (Smith  et  al.  1987). 
This  was  a  project  at  UC  Santa  Barbara  to  develop  a  knowledge-based  GIS  system,  which 
was  based  on  Common  Lisp,  Pascal  and  C.  It  included  a  means  of  representing  both  vector 
and raster (pixel-based) data, and defined a spatial object language. It had: (1) a query
mode supporting a simple but versatile set of query forms; (2) a learn mode in which the
system could modify and augment its knowledge base; (3) an edit mode in which the user
could modify and augment the spatial object language, as well as the knowledge base; and
(4)  a  trace  mode  in  which  the  user  could  follow  the  processing  steps  being  executed  by  the 
system.  This  work  represents  more  complete  development  in  terms  of  supporting  queries 
and  analysis  than  has  so  far  been  achieved  in  the  current  research.  However,  the  thrust  of 
my  thesis  is  toward  proof  of  concept  that  a  completely  object-oriented  approach  has  merit 
for  implementing  a  GIS  framework.  The  KBGIS-II  project  helps  inform  the  current 
research  of  some  aspects  of  the  overall  framework  that  should  receive  attention. 

With  LOBSTER,  Egenhofer  and  Frank  (1990)  present  an  interesting  approach  to 
building  a  Prolog-based  spatial  query  language,  resulting  in  progress  toward  a  high  level 
abstraction  of  spatial  data  and  geometric  operations,  but  note  some  significant  difficulties. 
For  instance,  "Prolog  contains  no  provisions  to  prevent  the  entry  of  invalid  or 
contradictory  data.  .  .  .  Such  errors  are  extremely  difficult  to  detect.  If  the  database 
contains  large  numbers  of  facts,  visual  inspection  by  browsing  is  not  possible  anymore." 
(p.  924)  Another  source  of  problems  was  that  "Some  Prolog  programs  rely  on  the  order  in 
which  facts  and  rules  are  entered  into  the  database."  Both  these  issues  relate  to  the 
difficulties  of  trying  to  apply  a  strictly  logic-based  approach  in  a  problem  domain 
requiring  more  procedural  control. 

Some additional insight into the issues and difficulties of developing expert
systems is brought out by Han and Kim (1989). In this paper they discuss some of the
distinctions between standard database management systems (DBMS) and decision
support systems (DSS) in urban planning (p. 298):

The problems dealt with by DSS are generally different from those dealt with by DBMS.
DBMS is suited for structured problems that have a standard operational procedure,
decision rules, and clear output formats, such as those used in identifying low income
districts or in determining the median income of a city. DSS, on the other hand, is intended
for unstructured or semistructured problems, such as estimating fiscal and other impacts of
land development proposals, to provide quantitative support to the decision maker.

Han  and  Kim  go  on  to  inquire  as  to  the  reasons  why  urban  planning  complicates  the  use  of 
DSS  and  more  sophisticated  expert  systems.  Among  their  findings  is  a  list  of  suggested 
guidelines  for  identification  of  tasks  suited  to  expert  systems  approaches.  These  were 
accumulated  through  a  number  of  sources  (Han  and  Kim  1989,  p.  300): 

1. Genuine experts exist who can articulate their problem solving methods;

2.  Experts  agree  on  solutions; 

3.  The  task  is  not  poorly  understood; 

4.  The  problem  typically  takes  a  few  minutes  to  a  few  hours  to  solve; 

5.  No  controversy  over  problem  domain  rules  exists; 

6.  The  problem  is  clearly  specifiable  and  well-bounded;  and 

7. The problem solving should be judgmental in nature, not numerical.

For those who are familiar with some of the battlegrounds in urban planning, these
conditions  will  seem  simplistic  and  naive.  Some  of  the  reasons  they  are  suggested  are  to 
support  repeatable  results  and  to  allow  objective  validation  of  solutions  found.  In  any  case 
we  must  start  somewhere,  and  progress  is  being  made.  It  is  part  of  my  goal  with  the 
current  research  to  contribute  to  this  progress. 

Importance  and  Contributions  of  This  Thesis 

We  have  now  looked  very  briefly  at  the  main  technological  threads  which  come 
together  in  the  current  research:  urban  system  modeling,  database  management,  GIS, 
object-oriented  programming,  and  knowledge-based  systems.  Thus  far,  all  major 
commercial GIS products (both relational and object-oriented) except Objective Facilities
Management (OFM) have their own proprietary programming or scripting language and
database  system  repository,  although  they  generally  support  one  or  more  of  the  major 
RDBMS  products  as  well.  It  is  certainly  understandable  why  this  would  be  so:  by 
controlling  the  language  with  which  a  GIS  user  accesses  and  modifies  the  data,  the  GIS 
software  manufacturer  has  fewer  problems  to  cope  with  in  system  development  and 
integration,  as  the  products  and  users'  applications  grow  and  change. 

However,  it  is  my  perspective  that  this  approach  inhibits  users'  ability  to  develop 
innovative  solutions  to  meet  their  needs,  and  greatly  limits  the  number  of  trained 
programmers  in  the  marketplace  who  might  have  experience  with  a  given  GIS  product. 
Considerable talent and effort have been directed to the development of each major
programming  language  such  as  Cobol,  Fortran,  Pascal,  C,  C++,  Smalltalk  and  others.  The 
advances  being  made  yearly  with  Smalltalk,  C++,  Java,  and  other  emerging  systems  are 
almost  staggering.  Similarly,  RDBMS  and  ODBMS  each  represent  very  significant  areas 
of  intense  research  and  development  in  their  own  rights,  independent  of  the  applications 
for  which  they  are  used.  It  is  inconceivable  that  any  one  of  the  software  manufacturers  in 
the  GIS  field  can  compete  with  the  functionality,  robustness,  interoperability  on  different 
computer  platforms,  and  tools  for  development  and  debugging  that  are  now  expected  of 
most  modern  programming  languages  and  database  systems.  Nor  can  the  GIS  industry 
easily  tap  into  the  larger  workforce  of  experienced  programmers  and  consultants  using 
these  other  languages  and  systems.  It  is  my  perspective  that  systems  such  as  OFM  and 
OVPF  represent  the  kind  of  approach  which  can  combine  the  capabilities  needed  in  a  GIS 
with  the  strengths  and  other  advantages  provided  by  using  industry-standard  programming 
languages and ODBMS. While OVPF represents a proof-of-concept at this stage,
exhibiting no particular spatial analysis functionality, that kind of capability can be
implemented  with  Smalltalk  or  another  language,  and  integrated  closely  with  the  geo-data 
handling  capabilities  already  present  in  OVPF. 

In  addition  to  addressing  the  more  traditional  aspects  of  GIS,  the  current  research 
provides  the  essential  framework  for  supporting  expert  systems  applications,  without 
having  to  carry  along  the  significant  overhead  of  a  complete  "expert  system  shell."  This  is 
done  through  implementation  of  a  simple,  elegant  and  extensible  rule-based  framework 
and  event-detection  mechanism  in  Smalltalk,  as  part  of  the  core  functionality  for  creating 
and  modifying  geographic  features.  Because  of  certain  features  in  Smalltalk  such  as 
dynamic  binding  (Goldberg  and  Robson  1989),  it  is  quite  straightforward  to  design  an 
application  which  can  modify  its  own  structure  and  behavior  at  runtime,  at  the  user's 
request.  Such  a  system  can  also  be  designed  to  be  capable  of  adding,  modifying,  and 
removing  rules  based  on  input  from  multiple  simultaneous  users  in  real  time.  This  is  a 
powerful  capability  that  could  conceivably  lead  to  development  of  expert  systems  which 
can learn and adapt to changing conditions, which is necessary functionality for use in increasingly
complex  urban  planning  activities. 
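
The idea of adding and removing rules while the application is running can be sketched briefly. The following Python fragment is only an analogy to the Smalltalk framework described in this thesis, not a transcription of it. The class names (GeoFeature, Road, RuleViolation), the rule signature and the "lanes" attribute are hypothetical. Rules may be attached to a class, in which case they apply to all of its instances, or to a single instance.

```python
class RuleViolation(Exception):
    """Raised by a rule to veto a requested update."""


class GeoFeature:
    rules = []                          # rules in effect for every feature

    def __init__(self, name, **attributes):
        self.name = name
        self.attributes = dict(attributes)
        self.instance_rules = []        # rules in effect for this object only

    def applicable_rules(self):
        # Collect rules declared on this class and each of its superclasses,
        # then those attached to the individual instance.
        found = []
        for cls in type(self).__mro__:
            found.extend(cls.__dict__.get("rules", []))
        return found + self.instance_rules

    def update(self, key, value):
        # The update request is the triggering event: every applicable rule
        # is evaluated first, and any rule may veto by raising RuleViolation.
        for rule in self.applicable_rules():
            rule(self, key, value)
        self.attributes[key] = value


class Road(GeoFeature):
    rules = []                          # rules specific to roads


def lanes_positive(feature, key, value):
    if key == "lanes" and value < 1:
        raise RuleViolation("a road needs at least one lane")


road = Road("A1", lanes=2)
Road.rules.append(lanes_positive)       # rule added at runtime
try:
    road.update("lanes", 0)
except RuleViolation as problem:
    print("update rejected:", problem)
```

Because the rule collections are ordinary objects consulted at the moment of each update, removing a rule (for example with Road.rules.remove(lanes_positive)) takes effect immediately, without restarting the program.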


MATERIALS  AND  METHODS 


This  section  is  in  two  parts.  The  first  part  describes  the  software  environment 
which  was  used  to  conduct  the  programming  (the  "materials"),  and  the  second  part 
describes  the  various  issues  encountered  and  approaches  used  to  carry  out  the 
programming  tasks. 

Object-Oriented  Software  Development  Tools 

The  development  of  this  GIS  framework  was  greatly  facilitated  by  access  to 
excellent  tools:  the  Smalltalk  development  environment,  a  source  code  management 
system,  an  object-oriented  database  management  system  (ODBMS),  and  of  course  the 
computer  platform  itself.  These  will  each  be  described  briefly  below. 

Smalltalk  Programming  Environment 

Smalltalk  was  chosen  as  the  development  platform  for  the  current  project  (OVPF) 
initially  because  that  was  the  language  used  for  Objective  Facilities  Management  (OFM), 
its  "parent"  program.  However,  there  are  many  reasons  for  its  use  in  OFM  and  its 
continuance  in  OVPF:  its  rich  development  and  debugging  environment,  extensible 
nature, hooks to commercially available ODBMSs, and scalability for working with both
small and large geographic data sets (these varied from 15MB to over 300MB for each
complete Vector Product Format source database).

The specific version of Smalltalk chosen was VisualWorks from ParcPlace-Digitalk,
Inc.¹ This product included the Smalltalk language editor, compiler, user
interface  building  tools,  cross-reference  system  for  variables,  objects  and  methods,  and 
various  browsers  (rather  like  having  an  encyclopedia  of  the  program  built  for  the 
programmer  by  the  system),  all  integrated  with  a  graphical  interface.  The  browsers 
provide lookup capability for (1) all methods that send a given message; (2) all methods
that  implement  a  given  message;  and  (3)  all  references  to  a  given  instance  variable,  class 
variable,  class-instance  variable,  or  global  object  such  as  a  class  itself.  The  runtime 
debugger  supports  examination  of  the  process  stack  of  currently-active  methods  at  any 
point  in  time.  The  debugger  also  allows  the  user  to  edit  and  recompile  a  method,  then 
continue  execution  of  the  current  process  stack  from  the  recompiled  method,  without 
having  to  stop  and  restart  the  program.  These  were  invaluable  tools  throughout  the 
development  of  OVPF. 

Source  Code  Configuration  Management  Facility 

In  addition  to  VisualWorks,  we  acquired  licenses  for  ENVY/Developer  by  Object 
Technology  International  (OTI)  of  Ottawa,  Canada  (the  license  is  purchased  through 
ParcPlace).  This  is  a  sophisticated  source  code  versioning  and  configuration  management 
facility  which  has  been  developed  to  support  all  the  major  brands  of  Smalltalk  (including 
IBM, Digitalk, and Enfin, besides ParcPlace). ENVY supports team programming by
allowing multiple programmers to share a common library of Smalltalk source code. The
library management and security are seamlessly integrated into the programming editor,
compiler and browsers, precluding the need for the user to always remember to follow
proper library checkout/checkin procedures as is typical with other programming source
code managers. Multiple programmers can even divide and work on different portions of
the same object class without conflict. This facility was critical for the management of
OVPF's hundreds of classes and thousands of methods, developed at an intense pace
during the two years, with geographic separation among the team members. ENVY
consists of two modules: one for the server computer, and one for each client programmer.
The library manager is installed on a host server that is accessible to all team members
(access can even be physically distributed over the Internet, though performance suffers).
Each programmer works with a Smalltalk image initially provided with ENVY, that has
the library management subsystem integrated with the rest of the Smalltalk development
system.

1. The company was called ParcPlace Systems Inc. through most of the duration of this
project. Digitalk Inc. was ParcPlace Systems' chief competitor until they merged in
August 1995. References to their separate products in this thesis are now obsolete,
but will be made nevertheless.

Object-Oriented  Database  Management  System 

The  third  major  software  component  was  the  ODBMS.  While  the  research  project 
included  evaluation  and  development  with  both  GemStone  (GemStone  Systems  Inc.  1995) 
and  ObjectStore  (Object  Design  Inc.  1995),  I  will  limit  this  discussion  to  the  design  and 
implementation  of  OVPF  with  ObjectStore,  for  simplicity  and  clarity.  ObjectStore 
includes  both  a  server  module  and  a  client  module.  The  server  module  must  be  running  on 
the  host  computer  having  the  ODBMS  repository.  Each  client  programmer  then  works 
with a Smalltalk image which has been customized to include hooks for accessing the
ObjectStore server over the network, much like ENVY. All of these layers can be
envisioned together as shown in Figure 4 and Figure 5.

For this project, we used the Sun Solaris 2.4 operating system on a Sparc 20
workstation for the Smalltalk and ODBMS host server. This was connected on a local
token-ring network to other workstations which could serve as clients. Each of the major
software subsystems requires considerable processor and data transfer resources. As
noted in Figure 4, it is recommended to have each of the major subsystems on its own hard
disk, to improve overall performance.

Approaches  Used  in  Building  OVPF  Components 

The  remainder  of  the  Materials  and  Methods  section  presents  the  substantive 
aspects of building the components for OVPF. This was a very large undertaking, and it is
beyond the scope of this thesis to describe in its entirety. Instead, I will focus attention on
the  portions  of  OVPF  which  have  the  most  bearing  on  the  goals  and  objectives  of  the 
research. 

Introducing  Some  Object-Oriented  Terms 

In  the  following  discussions,  it  may  be  helpful  to  be  acquainted  with  certain 
common  terms  used  in  describing  object-oriented  designs.  The  reader  is  referred  to  one  of 
the  references  on  Smalltalk  for  more  detailed  explanations  of  object-oriented  concepts 
(Goldberg  and  Robson  1989;  LaLonde  1994).  The  term  abstract  superclass  is  used  to 
represent  a  definitional  abstraction,  such  as  definition  of  variables  and/or  behavior  to  be 
shared  by  its  subclasses.  Instances  are  not  normally  created  from  abstract  classes. 
Concrete  classes,  on  the  other  hand,  are  those  that  are  expected  to  have  instances  made 
from  them.  These  terms  are  mainly  used  to  aid  in  learning  about  a  class  hierarchy;  to  call 
a  class  abstract  simply  implies  that  it  lacks  behavior  needed  for  creation  of  a  useful 
instance-object.

Instance variables are data structures for which each instance-object has its own
private  copy.  Instance  variable  definitions  in  one  class  are  inherited  as  part  of  the 
definition  by  all  of  its  subclasses.  In  Smalltalk,  instance  variable  names  begin  with  a 
lowercase  letter.  Class-instance  variables  are  data  structures  for  which  the  class  object  and 
each  of  its  subclasses  are  defined  to  have  a  private  copy  of  the  variable.  A  class-instance 
variable  can  be  used,  for  example,  to  hold  a  subclass-specific  default  value  for  a  constant 
that  can  be  accessed  with  the  same  name  from  any  class  in  the  hierarchy.  This  helps  reduce 
the  program's  "variable-name  vocabulary,"  which  is  one  of  the  benefits  of  object-oriented 
design. 

Class  variables  are  data  structures  for  which  the  defining  class  has  a  single  copy, 
that  can  be  directly  accessed  by  all  of  its  instances.  Per  Smalltalk  convention,  class 
variables  (and  other  shared  objects  including  classes)  have  names  that  begin  with  an 
uppercase  letter.  Class  variables  are  generally  used  either  to  hold  (1)  application-specific 
constants,  or  (2)  collections  of  specific  instances  of  a  class. 
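
For readers more familiar with other languages, the terms above can be loosely illustrated in Python. This is only a rough analogy under stated assumptions: Python has no direct equivalent of a Smalltalk class-instance variable, so a class attribute that each subclass overrides is used to stand in for one, and the class names are hypothetical.

```python
class GeoPrimitive:
    """Plays the role of an abstract superclass: it defines shared state and
    behavior but lacks what is needed to make a useful instance."""

    Registry = []              # like a class variable: one copy, shared by all
    default_color = "black"    # stands in for a class-instance variable

    def __init__(self, ident):
        self.ident = ident     # instance variable: private to each object
        GeoPrimitive.Registry.append(self)

    def draw(self):
        raise NotImplementedError("concrete subclasses must implement draw")


class EdgePrimitive(GeoPrimitive):      # concrete class
    default_color = "blue"              # subclass-specific value, same name

    def draw(self):
        return f"edge {self.ident} in {self.default_color}"


class NodePrimitive(GeoPrimitive):      # concrete class
    default_color = "red"

    def draw(self):
        return f"node {self.ident} in {self.default_color}"


edge = EdgePrimitive(1)
node = NodePrimitive(2)
```

Both objects respond to the same draw request, each with its own class's color, while the single Registry collection holds every instance created anywhere in the hierarchy.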

In  an  object-oriented  system  all  actions  are  the  result  of  sending  a  message  to  an 
object.  The  receiver-object  then  responds  by  executing  a  method  by  that  name.  For 
improved  readability  in  this  thesis,  I  use  the  terms  message  and  method  interchangeably. 
However,  these  terms  have  distinct  meanings;  i.e.,  for  a  given  message  there  may  be  one 
or  more  methods  defined,  as  any  number  of  objects  can  have  a  method  with  the  same 
name.

Conversion of Source Data from Vector Product Format to Smalltalk Objects

One  of  the  first  steps  that  was  required  for  OVPF  was  to  build  a  translator  in 
Smalltalk  capable  of  reading  Vector  Product  Format  (VPF)  data  files.  With  a  few 
exceptions,  these  are  well  specified  as  having  the  schema  information  (metadata)  for  a 
given  data  file  contained  in  the  header  of  that  file.  The  file  header  is  organized  in  three 
main  sections,  and  the  actual  geo-data  follows  immediately  after  these  sections.  This 
organization  is  described  in  (DMA  1993a,  section  3.6.1,  pp.  15-20).  To  summarize,  the 
first  header  section  consists  of  the  following  fields: 

1. Header length: 4-byte integer representing the number of bytes in the header.

2. Byte order flag: 'L' for least-significant byte first, and 'M' for most-significant
byte first.² (Ironically, this flag must be known before the
preceding numeric field can be interpreted.)

The  second  header  section  contains  only  one  field: 

3.  Table  description:  up  to  80  characters  of  textual  information. 

The  third  and  final  header  section  contains  the  actual  schema,  which  is  the  essential  part 
for  parsing  the  table's  data  content.  This  consists  of  repetitions  of  the  following  fields: 

4. Column name: up to 16 characters of textual information.

5. Field type: a single character defining the data type (one of those listed in the
first column of Figure 6 below).

6. Number of elements: an integer value representing either (a) the number of
textual characters, or (b) the number of occurrences of the specified numeric
field type.

7. Key type: a single character for the type of key field represented by the
column (one of P-primary, U-unique, or N-none).

8. Column description: up to 80 characters of textual information.

9. Value description table: up to 12 characters for a DOS-compatible filename
(either INT.VDT or CHAR.VDT) for the file containing textual descriptions
of the different values the column in each data record could have. This will
occur when the column is a nonspatial attribute of a given geo-feature.

2. Each integer and floating-point number requires 2, 4 or 8 bytes for its representation. The
byte order specifies which end of the bytes comes first. This is normally determined by the
processor architecture of the host platform. PC DOS on Intel processors, for example, is a
little-endian platform (least-significant byte first), while Macintosh and many Unix
workstations (such as the Sun Sparc) are big-endian (most significant byte first). Since VPF data
is intended to be read on any of these platforms, the GIS software needs to be written to
translate VPF integer and floating-point numbers appropriately for that platform.

By  knowing  the  schema  for  a  given  table,  the  program  can  loop  through  all  the  data 
records  in  that  table,  interpreting  each  field  (column  value)  according  to  the  schema.  An 
example  of  a  feature  table  with  header  and  records  is  shown  in  Figure  7. 
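As an illustration only (this is not the OVPF Smalltalk code), the column-definition part of a header written in the style of Figure 7 could be split into schema records roughly as follows. The sketch is in Python; the names SchemaColumn and parse_columns are hypothetical, and it assumes that descriptions never contain ':' or ',' characters.

```python
from dataclasses import dataclass


@dataclass
class SchemaColumn:
    name: str
    field_type: str   # single-character type code from Figure 6
    count: str        # number of elements, or "*" for variable length
    key_type: str     # P, U or N
    description: str
    vdt_name: str     # value description table file name, or "-"


def parse_columns(schema_text):
    """Split a ':'-separated list of column definitions into SchemaColumn records."""
    columns = []
    # The schema section ends with ';' and each column definition ends with ':'.
    for chunk in schema_text.rstrip(";").split(":"):
        chunk = chunk.strip().strip("\\").strip()
        if not chunk:
            continue
        name, rest = chunk.split("=", 1)
        fields = [f.strip() for f in rest.split(",")]
        columns.append(SchemaColumn(name.strip(), *fields[:5]))
    return columns


schema = ("ID=I,1,P,Row ID,-,-,:"
          "F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:"
          "VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;")
cols = parse_columns(schema)
for col in cols:
    print(col.name, col.field_type, col.count, col.key_type)
```

With the schema parsed once, each data record can then be walked column by column in the order given by the list.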

Because most of the VPF database tables follow this schema specification, it was
straightforward to create a generalized VPF table reader procedure. To implement this, I
created two main reader classes, VPFTableHeader and VPFSchemaColumn (see Figure 8),
as well as a hierarchy of classes to implement the specific properties and behavior of the
various data types listed in Figure 6. The data type classes were used to translate the data
values' byte representations between VPF and Smalltalk, as well as to maintain data
integrity (e.g., ensuring that text field values did not exceed their schema-specified
length). There was an added dimension of translating from the byte order of the VPF
source data (so far, this has always been little-endian) to that used by the operating system
platform on which OVPF was running.
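The byte-order translation can be sketched in Python with the standard struct module, which lets the reader state the source byte order explicitly instead of relying on the host platform. The table and function names below are hypothetical, and only a few fixed-size types from Figure 6 are covered.

```python
import struct

# Map the header's byte order flag to struct's byte-order prefix.
ORDER_PREFIX = {"L": "<", "M": ">"}

# struct codes for some fixed-size VPF types: short integer, long integer,
# short floating point, long floating point.
TYPE_CODE = {"S": "h", "I": "i", "F": "f", "R": "d"}


def decode_number(raw, field_type, order_flag):
    """Decode one numeric value from raw bytes using the table's byte order flag."""
    fmt = ORDER_PREFIX[order_flag] + TYPE_CODE[field_type]
    return struct.unpack(fmt, raw)[0]


raw = b"\x01\x00\x00\x00"
print(decode_number(raw, "I", "L"))   # read as least-significant byte first
print(decode_number(raw, "I", "M"))   # the same bytes read most-significant byte first
```

The same four bytes yield different integers under the two flags, which is why the flag has to be read before any numeric field.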

Because  the  mechanics  of  reading  VPF  tables  are  very  straightforward 
computationally  once  their  format  is  known  and  an  object  structure  is  chosen,  I  will  not  go 
into  any  further  detail  on  this  particular  task.  It  should  suffice  to  say  that  Smalltalk  was 
capable  of  reading  and  interpreting  all  VPF  data  files,  including  those  which  did  not  have 
their  schema  in  the  header;  these  included  variable-length  index  files,  spatial  index  files, 
and  thematic  index  files  (see  DMA  1993a,  sections  5.4.1.3,  5.4.2  and  5.4.3,  pp.  77-83). 
The  Triplet  ID  field  was  particularly  troublesome,  as  it  is  a  variable-length  array  of  one  or 
more  integer  values,  whose  length  and  content  are  determined  by  decoding  the  bits  of  the 
first  byte  (DMA  1993a,  section  5.4.6,  p.  87).  Nevertheless,  all  these  are  handled  within  the 
OVPF  classes  just  mentioned. 
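To make the Triplet ID decoding concrete, here is a hedged Python sketch. It assumes that the first byte carries three 2-bit codes in its high-order bits (for the ID, tile ID and external ID, in that order) and that codes 0 to 3 select widths of 0, 1, 2 and 4 bytes. This is consistent with the 1 to 13 byte range given in Figure 6, but the authoritative layout is the one in DMA 1993a, section 5.4.6. The function name is hypothetical.

```python
import struct

# Assumed mapping from each 2-bit code to the width in bytes of its sub-field.
WIDTH_FOR_CODE = {0: 0, 1: 1, 2: 2, 3: 4}
UNPACK_CODE = {1: "B", 2: "H", 4: "I"}


def decode_triplet_id(buf, order_flag="L"):
    """Return ((id, tile_id, ext_id), bytes_consumed); absent parts are None."""
    prefix = "<" if order_flag == "L" else ">"
    type_byte = buf[0]
    pos = 1
    values = []
    for shift in (6, 4, 2):   # top three 2-bit groups of the first byte
        width = WIDTH_FOR_CODE[(type_byte >> shift) & 0b11]
        if width == 0:
            values.append(None)
            continue
        values.append(struct.unpack_from(prefix + UNPACK_CODE[width], buf, pos)[0])
        pos += width
    return tuple(values), pos


# First byte 0b01100000: 1-byte ID, 2-byte tile ID, no external ID.
triplet, used = decode_triplet_id(bytes([0x60, 7, 0x34, 0x12]))
print(triplet, used)
```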

Representation  of  Metadata  Objects 

The term metadata is used here to represent the parts of a VPF source database that
define the actual geo-feature data. There is a substantial amount of definitional content in
a given VPF database, thanks to its open specification. However, the metadata is quite
fragmented among numerous files, and must first be assembled and organized in some
manner before it is possible to start reading the actual feature data with it.

The  approach  taken  with  OVPF  is  to  initialize  a  metadata  object  web  for  each  VPF 
database  to  be  accessed.  This  is  a  one-time  procedure  for  each  database,  after  which  the 
metadata  web  is  kept  in  the  ODBMS  repository  for  future  use  in  reading  VPF  source  data 
from  the  CD-ROM.  Figure  9  summarizes  the  steps  involved  in  processing  the  source 


Type      Column Type                                  Length
Abbrv.                                                 (Bytes)
--------  -------------------------------------------  --------
T,n       Fixed-length text                            n
T,*       Variable-length text                         n + 4
F,1       Short floating point                         4
R,1       Long floating point                          8
S,1       Short integer                                2
I,1       Long integer                                 4
C,n       2-coordinate array, short floating point     8n
C,*       2-coordinate string                          8n + 4
B,n       2-coordinate array, long floating point      16n
B,*       2-coordinate string                          16n + 4
Z,n       3-coordinate array, short floating point     12n
Z,*       3-coordinate string                          12n + 4
Y,n       3-coordinate array, long floating point      24n
Y,*       3-coordinate string                          24n + 4
D,1       Date and time                                20
X,1       Null field                                   (none)
K,1       Triplet id                                   1 - 13

Figure 6. Vector Product Format Data Types
Source: after (DMA 1993a, Table 56, p. 86)




(Header length and byte order);\
ENVAREA.AFT.Environment Area Feature Table;-;\
ID=I,1,P,Row ID,-,-,:\
F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:\
VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;

1    ZC040    2
2    ZC040    1

Figure 7. Example of Feature Table with Header and Records
Source: (DMA 1993a, Table 6, p. 20)
Figure 8. VPFTableHeader and VPFSchemaColumn Class
Definitions and Example Instance Values
database metadata in creating a Smalltalk object web for this data. Notice that each of the
OVPF metadata classes has pointers to two other metadata classes. This is a way of
representing hierarchical containment, or aggregation, with both forward and backward
pointers. For example, the libraries instance variable of VPFDatabase holds onto a
collection of VPFLibrary instances, each of which holds onto a "back-pointer" to its
VPFDatabase container. With this structure (starting from the bottom of the pointers in
Figure 9), an individual VPFFeatureDef instance can quickly traverse its lineage to access
coverage-, library-, and database-level metadata as needed.
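The forward and backward pointers can be pictured with a small Python sketch. The class names follow the text, but the constructor signatures, the lineage method, and the library, coverage and feature names used in the example are hypothetical.

```python
class VPFDatabase:
    def __init__(self, name):
        self.name = name
        self.libraries = []                  # forward pointers


class VPFLibrary:
    def __init__(self, name, database):
        self.name = name
        self.database = database             # back-pointer to the container
        self.coverages = []
        database.libraries.append(self)


class VPFCoverage:
    def __init__(self, name, library):
        self.name = name
        self.library = library               # back-pointer
        self.feature_defs = []
        library.coverages.append(self)


class VPFFeatureDef:
    def __init__(self, name, coverage):
        self.name = name
        self.coverage = coverage             # back-pointer
        coverage.feature_defs.append(self)

    def lineage(self):
        """Walk the back-pointers up to the database level."""
        coverage = self.coverage
        library = coverage.library
        return [library.database.name, library.name, coverage.name, self.name]


db = VPFDatabase("DNC01")
lib = VPFLibrary("HARBOR", db)               # hypothetical library name
cov = VPFCoverage("ENV", lib)                # hypothetical coverage name
fdef = VPFFeatureDef("ENVAREA", cov)
print(fdef.lineage())
```

Each level is reachable in one pointer hop from the level below it, so no table joins or file lookups are needed to find a feature definition's enclosing coverage, library or database.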

Only a couple of method names have been shown in Figure 9. VPFDatabase class has
a set of methods for initializing the metadata object web (represented here by the method
"initializeVPFProductFrom: pathname"). Interpretation of actual feature data based on the
metadata has been made the responsibility of VPFCoverage, as this corresponds to the
level at which features and graphical primitives are linked in the VPF source database.

Representation  of  Geo-Feature  Objects 

As  should  now  be  all  too  apparent,  the  complete  set  of  data  representing  each  geo- 
feature  in  Vector  Product  Format's  file  structure  is  very  fragmented.  One  of  the  benefits  of 
the  object-oriented  approach  is  to  tie  the  pieces  together  with  object  pointers  instead  of 
join  tables,  for  more  direct  access  and  control.  The  greatest  single  cause  of  the 
fragmentation within a given coverage is the need to represent and synchronize both spatial
and  nonspatial  attributes  of  the  features.  This  subsection  describes  the  assembly  of 
nonspatial  aspects  of  each  feature,  while  the  next  subsection  describes  the  handling  of 
spatial  and  topological  attributes. 


Figure  9.  Steps  to  Create  Metadata  Web 

(Please  also  refer  to  Figure  2  on  page  14) 

1.  Process  database-level  files: 

-  create  VPFDatabase  instance; 

-  assign  rdbPath  variable  to  hold  the  directory  pathname 
for  the  source  database; 

-  store  the  VPFTableHeader  instances  created  for  reading  the 
Database  Header  Table  (DHT)  and  Library  Attributes  Table  (LAT) 
in  VPFDatabase  instance  variables. 

2.  Process  library-level  files: 

-  loop  through  Library  Attributes  Table  (LAT)  to  initialize  all 
VPFLibraries  for  this  database  (name,  bounds,  scales,  tile  names); 

-  store  the  VPFTableHeader  instances  created  for  reading  the 
Library  Header  Table  (LHT),  Geographic  Reference  Table  (GRT), 
and  Coverage  Attributes  Table  (CAT)  in  VPFLibrary  instance 
variables. 

3.  Process  coverage-level  files: 

-  loop through Coverage Attributes Table (CAT) to initialize all
VPFCoverages for this library (name, Value Description Table (VDT)
headers, Feature Index Table (FIT) headers, all primitive table
headers, and feature-notes headers).

4.  Process  feature-level  files: 

-  loop  through  Feature  Class  Attributes  (FCA)  to  initialize  all 
VPFFeatureDefs  for  this  coverage  (name,  Feature  Table  (FT)  header, 
prim  header,  prim-join  headers,  Value  Description  Table  (VDT) 
entries  that  are  valid  for  this  featureDef,  coverage  and  library). 
The definitional organization of feature attributes in OVPF is depicted in Figure
10. The VPFFeature hierarchy handles the nonspatial aspects of features, while the
VPFFeatureSymbol class hierarchy handles the spatial aspects. The featureDef instance
variable defined in VPFFeature class provides the link for each feature instance to its
complete set of metadata just described (see Figure 9).

The methods shown in Figure 10 are a small subset of the full set of procedures
implemented, but these are sufficient for the present discussion. Notice the
ReadWriteStream object (Figure 10B) which is used to represent the attributes instance
variable of each geo-feature object. The ReadWriteStream is an important object that is
part  of  the  Smalltalk  system  class  library  and  is  used  to  represent  and  manage 
sequentially-accessed  data  collections,  much  like  one  might  think  of  accessing  data  on  a 
magnetic  tape.  The  contents  instance  variable  of  this  object  is  used  in  this  case  to  hold  all 
nonspatial  attribute  values  in  a  single  collection  of  bytes,  which  is  essentially  a  direct  copy 
of  the  geo-feature's  source  data  record  from  the  VPF  feature  table.  Two  simple  but 
important  methods  in  VPFFeature  are  "valueForAttribute:  aName"  and  "putValue:  aValue 
forAttribute:  aName."  These  methods  provide  a  generalized  means  of  accessing  and 
modifying  any  one  of  a  feature's  nonspatial  attributes  (such  as  F_CODE  or  VAV  from 
Figure  7  on  page  50).  Essentially,  these  methods  look  up  the  attribute's  data  type  and 
position  from  the  feature  table  schema  (held  in  the  ftHeader  instance  variable  of  the 
VPFFeatureDef  metadata  object),  then  perform  the  selected  action  on  those  bytes  in  the 
contents  of  the  ReadWriteStream  instance.  Other  methods  in  the  VPFFeature  class 
hierarchy  not  shown  here  include  means  of  maintaining  the  correct  feature-primitive 
linkages  during  changes  in  topology. 
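The lookup-then-act pattern behind these two methods can be sketched in Python, with io.BytesIO standing in for Smalltalk's ReadWriteStream. This is a simplified illustration, not the OVPF code: it handles only fixed-length text and single long integers, assumes little-endian data, and the names Feature, locate, value_for_attribute and put_value are hypothetical.

```python
import io
import struct

# (name, type, count) per column in record order, as in the Figure 7 schema.
SCHEMA = [("ID", "I", 1), ("F_CODE", "T", 5), ("VAV", "I", 1)]
SIZE = {"I": 4, "T": 1}   # bytes per element


def locate(schema, name):
    """Return (byte offset, type, count) of a column within a record."""
    offset = 0
    for col_name, ftype, count in schema:
        if col_name == name:
            return offset, ftype, count
        offset += SIZE[ftype] * count
    raise KeyError(name)


class Feature:
    def __init__(self, schema, record_bytes):
        self.schema = schema
        self.attributes = io.BytesIO(record_bytes)   # raw copy of the source record

    def value_for_attribute(self, name):
        offset, ftype, count = locate(self.schema, name)
        self.attributes.seek(offset)
        raw = self.attributes.read(SIZE[ftype] * count)
        return raw.decode("ascii") if ftype == "T" else struct.unpack("<i", raw)[0]

    def put_value(self, value, name):
        offset, ftype, count = locate(self.schema, name)
        if ftype == "T":
            raw = value.encode("ascii")
            if len(raw) > count:
                raise ValueError("text longer than the schema allows")
            raw = raw.ljust(count)                   # pad with spaces
        else:
            raw = struct.pack("<i", value)
        self.attributes.seek(offset)
        self.attributes.write(raw)


record = struct.pack("<i5si", 1, b"ZC040", 2)        # first record of Figure 7
feature = Feature(SCHEMA, record)
print(feature.value_for_attribute("F_CODE"), feature.value_for_attribute("VAV"))
feature.put_value(1, "VAV")
```

Keeping the record as one byte buffer means a feature carries no per-attribute objects; the schema held by the metadata supplies all the structure.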
VPFFeatureSymbol objects hold onto the actual graphic primitives in their
graphicElements instance variable (this is the subject of the next topic).
VPFFeatureSymbols also respond to display-related requests from the graphical user
interface.

Representation  of  Graphical  Primitives  and  Topological  Relationships 

One  of  the  more  intriguing  issues  was  deciding  how  to  represent  spatial  topology 
(adjacency  and  contiguity  of  graphical  primitives)  within  the  Smalltalk  object-oriented 
data  model.  Each  feature  object  is  associated  with  a  set  of  latitude-longitude  coordinates, 
referred  to  as  graphical  primitives.  Point  features  are  associated  with  entity-  and 
connected-node  primitives.  Line  features  are  associated  with  edge  primitives.  An  area 
feature  is  associated  with  a  face  primitive  consisting  of  a  ring  of  edge  primitives,  and  text 
features  are  associated  with  text  primitives. 

Because  any  one  line  feature  object  may  consist  of  multiple  edges,  and  any  single 
node,  edge  or  face  primitive  could  be  used  by  more  than  one  feature  object,  great  care 
must  be  taken  to  maintain  the  correct  linkages  between  the  features  and  primitives. 
Topological  relationships  among  the  primitives  must  also  be  maintained  across  all  features 
within  a  given  coverage  and  tile,  according  to  the  VPF  specification. 

OVPF's predecessor, Objective Facilities Management (OFM), introduced an
object known as a DrawOrders.3 This is a very simple structure whose inspiration is
drawn from Digitalk's Smalltalk/V for OS/2 Presentation Manager (Digitalk 1989, p. 464).


3.  The  DrawOrders  class  and  related  GraphicsEngine  were  initially  developed  by  Bob  Williams 
for  OFM. 


Figure 10. Representation of Geo-Features in OVPF

(A) VPFFeature abstract class hierarchy:

-  VPFFeature class provides shared definition and methods for
   accessing and modifying attributes and defaultColor.

-  VPFLineFeature class provides shared definition of
   defaultLineType.

-  These classes are not instantiated, but are abstract superclasses
   of concrete feature classes.

-  Each subclass has its own private copy of a value for
   defaultColor, defaultLineType and defaultAreaPattern
   (where defined).

-  The featureDef instance variable holds onto an object
   pointer to the metadata objects for each feature class.

(B) Instances of ReadWriteStream class are used to hold
nonspatial attributes of each instance of a VPFFeature
subclass; ReadWriteStream instances understand how
to read ("next" message), write ("nextPut" message),
and reposition themselves ("reset" message and others).

(C) VPFFeatureSymbol class hierarchy:

-  These classes provide shared definition and methods for
   accessing and modifying the graphical primitives for a
   given feature (see Figure 11 below).

-  VPFFeatureSymbol is an abstract class with no instances,
   but instances are made from each of its subclasses.
(A)  VPFFeature
         Class Instance Variables: defaultColor
         Instance Variables: id, featureDef, attributes, notes, symbol
         Operations: valueForAttribute: aName
                     putValue: aValue forAttribute: aName
     VPFLineFeature
         Class Instance Variables: defaultLineType
     VPFAreaFeature
         Class Instance Variables: defaultAreaPattern
     VPFPointFeature
     VPFTextFeature

(B)  ReadWriteStream
         Instance Variables: contents, position
         Operations: next: anInteger
                     nextPut: anObject
                     nextPutAll: aCollection
                     reset

(C)  VPFFeatureSymbol
         Instance Variables: feature, graphicElements, boundingBox,
                             color, isHilighted
         Operations: beErased
                     beHilighted
                     beUnhilighted
     VPFPointFeatureSymbol
     VPFLineFeatureSymbol
         Instance Variables: lineType
     VPFAreaFeatureSymbol
         Instance Variables: areaPattern
The structure contains a variable-length array of bytes (the contents attribute). Each
contents array has the following implicit organization:

•  opcode  -  a  single  byte  whose  integer  value  (0  -  255)  represents  an  operation  code, 
such  as  set  polyline,  continue  line,  set  color,  etc. 

•  byte  length  -  a  single  byte  whose  integer  value  (0  -  255)  represents  the  number  of 
bytes  remaining  in  this  draw  order. 

•  data  bytes  --  the  bytes  whose  integer  or  floating-point  values  represent  the  location 
points,  the  line-color  index,  etc.  for  this  draw  order. 

The  contents  byte-arrays  from  several  DrawOrders  can  be  concatenated  into  a  single 
DrawOrders  instance,  to  include  an  arbitrary  number  of  instructions  for  displaying 
complex  graphical  objects.  This  structure  is  not  only  versatile,  it  is  very  compact  and 
efficient  for  representing  variable-length  locational  coordinate  data. 
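The opcode, byte length and data layout described above can be illustrated with a short Python sketch. The opcode values, the helper names, and the choice of little-endian 4-byte floats for coordinates are assumptions made for the example; the text does not specify them.

```python
import struct

OP_SET_COLOR = 1   # hypothetical opcode values
OP_POLYLINE = 2


def make_order(opcode, data):
    """Build one draw order: opcode byte, length byte, then the data bytes."""
    if len(data) > 255:
        raise ValueError("a single draw order holds at most 255 data bytes")
    return bytes([opcode, len(data)]) + data


def iter_orders(contents):
    """Walk a concatenated contents array, yielding (opcode, data) pairs."""
    pos = 0
    while pos < len(contents):
        opcode, length = contents[pos], contents[pos + 1]
        yield opcode, contents[pos + 2:pos + 2 + length]
        pos += 2 + length


color = make_order(OP_SET_COLOR, bytes([4]))                       # color index 4
line = make_order(OP_POLYLINE, struct.pack("<4f", 0.0, 0.0, 1.5, 2.0))
contents = color + line                                            # concatenation

for opcode, data in iter_orders(contents):
    print(opcode, len(data))
```

Because every order states its own length, orders can be concatenated and skipped without any separate index, which is what keeps the structure compact.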

Even without the need to manage spatial topology, DrawOrders are useful objects
for handling graphical data and operations. However, supporting VPF graphical primitives
with full spatial topology requires a refinement of this definition, so these primitives are
implemented by subclassing the DrawOrders class. A straightforward example would be
to have EntityNode, ConnectedNode, Edge, Face, and Ring classes defined as direct
subclasses of DrawOrders. In this way, each subclass would inherit the DrawOrders
contents instance variable, and add its own specific topological attributes as needed. It is
also important, however, for each graphical object to hold onto a collection of the OVPF
feature objects that use that graphical object. This is handled in OVPF by defining the
TopologicalStructure class as a subclass of DrawOrders and as a superclass of each VPF
graphical primitive class (see Figure 11). The TopologicalStructure's features attribute is
handled as a collection of VPFFeatures because a given unique graphical primitive may be
used to help draw any number of VPF geo-features. Each feature object holds onto an
identity-pointer to its corresponding collection of graphical primitive objects, thus
enabling both features and primitives to have access to each other.

In addition, the primId (primitive ID) and tileId attributes of TopologicalStructure
are inherited by each subclass, providing a holding place for primary-key data from the
relational-VPF files. For simplicity of supporting both import and export operations with
the relational-VPF data, the primId, tileId, and topological attributes are assigned the VPF
record ID value of the corresponding graphical primitive objects, rather than unique
object-identity pointers. As features are added, deleted, and moved with respect to each
other, these primId values are maintained just as they would be in a relational GIS
framework.


(Indentation denotes subclassing; instance variables follow each class name.)

VPFDrawOrder                      contents
    VPFTopologicalStructure       features, primId, tileId
        VPFEntityNode             containingFace
        VPFConnectedNode          firstEdge
        VPFEdge                   startNode, endNode,
                                  leftEdge, rightEdge,
                                  leftFace, rightFace
        VPFFace                   ringPtr
        VPFRing                   firstEdge
        VPFTextPrim               text, shapeLine

Figure 11. Object Definitional Hierarchy for Representing
VPF Graphical Primitives with Spatial Topology
Source: after (Arctur et al. 1995b, p. 14)
Presently, all source data comes to OVPF from relational-VPF databases. At this
stage in the prototype development, we have assumed that all feature attribute, location,
and topological relationships in the source data are initially correct. Thus we can focus our
attention on developing full build and clean topological support (ESRI 1994) in a
stepwise manner, beginning with simply maintaining topology locally during individual
feature  changes.  We  now  have  the  capability  to  interactively  add,  delete,  and  change 
location  coordinates  of  a  single  point,  line  or  area  feature  at  a  time  within  a  given  tile, 
while  maintaining  correct  topological  relationships  with  adjacent  and  contiguous  features 
(Chung  et  al.  1995).  This  is  handled  with  the  help  of  a  graphical  user  interface  (GUI)  that 
requires  the  user  to  accept  and  commit  changes  to  each  topological  relationship. 

Design  of  an  Object-Oriented  Spatial  Index 

The spatial index framework in OVPF is implemented with just two main classes
(this is a slight simplification for purposes of discussion). These classes are the
VPFSpatialDataManager and VPFSpatialDataCell, shown in Figure 12. This framework
presently uses a quadtree organization (after Samet 1994; see footnote 4), in which each
level of the tree represents a rectangular geographic area and can be divided into four
equally-sized quadrants. Each quadrant in turn can be subdivided, continuing recursively
until some predetermined limit is met. Within this structure, each geo-feature is inserted
into the smallest quadtree cell which can completely contain it. This cell is the feature's spatial


4. The quadtree structure used in OVPF is a simplified form of a spatial index structure originally
implemented by Bob Williams for OFM.
index. Insertion and queries start from the root (topCell) and progress recursively, until
one of two conditions is met: (1) the smallest cell has been found, or (2) the maximum
number of levels allowed has been reached. A maximum depth is necessary to limit the
extent of recursion for very small geo-features such as points. In OVPF we have used a
maximum level of 20. Note that features are indexed, not graphical primitives. This is to
reduce the complexity and computational overhead of insertion and retrieval.
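A minimal Python sketch of this insertion and query scheme follows. It is not the OVPF implementation: the class and method names are hypothetical, bounding boxes are plain (x0, y0, x1, y1) tuples, and subcells are created lazily on first use.

```python
def contains(outer, inner):
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class Cell:
    def __init__(self, x0, y0, x1, y1, level=0):
        self.bounds = (x0, y0, x1, y1)
        self.level = level
        self.features = []            # (feature, bounding box) pairs held here
        self.subcells = [None] * 4    # four quadrants, created on demand

    def quadrant_bounds(self, i):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        return [(x0, y0, xm, ym), (xm, y0, x1, ym),
                (x0, ym, xm, y1), (xm, ym, x1, y1)][i]

    def container_for(self, box, max_level):
        """Return the smallest cell that completely contains box."""
        if self.level < max_level:
            for i in range(4):
                qb = self.quadrant_bounds(i)
                if contains(qb, box):
                    if self.subcells[i] is None:
                        self.subcells[i] = Cell(*qb, level=self.level + 1)
                    return self.subcells[i].container_for(box, max_level)
        return self                   # no quadrant fits, or depth limit reached

    def query(self, box, found):
        found.extend(f for f, fbox in self.features if intersects(fbox, box))
        for sub in self.subcells:
            if sub is not None and intersects(sub.bounds, box):
                sub.query(box, found)
        return found


class SpatialIndex:
    def __init__(self, x0, y0, x1, y1, max_level=20):
        self.top_cell = Cell(x0, y0, x1, y1)
        self.max_level = max_level

    def insert(self, feature, box):
        cell = self.top_cell.container_for(box, self.max_level)
        cell.features.append((feature, box))
        return cell

    def features_in(self, box):
        return self.top_cell.query(box, [])


index = SpatialIndex(0, 0, 16, 16, max_level=3)
lake_cell = index.insert("lake", (1, 1, 3, 3))
road_cell = index.insert("road", (6, 6, 10, 10))   # straddles the center
buoy_cell = index.insert("buoy", (5, 5, 5, 5))     # a point feature
print(lake_cell.level, road_cell.level, buoy_cell.level)
```

The point feature would recurse indefinitely without the depth limit, since some quadrant can always contain it; the feature that straddles the center stays at the root because no single quadrant holds all of it.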

This  design  has  some  very  interesting  implications  and  potentials,  which  are 
brought out in the final Discussion section. One point to mention now, however, is that
because  each  VPFSpatialDataCell  holds  onto  direct  object  pointers  to  the  features  which  fit 
within  its  boundaries,  the  quadtree  is  more  than  just  an  index;  it  is  an  efficient,  general 
purpose  container  structure  for  all  the  geo-features.  This  proves  useful  in  the  design  of  the 
ODBMS  repository,  which  is  the  next  topic. 

Organization  of  Object  Webs  in  ODBMS  Repository 

There  are  two  main  groupings  of  database  objects  in  OVPF:  the  metadata  objects 
(table  headers,  schema  definitions,  value  descriptions,  and  others);  and  the  geo-feature 
objects  and  primitives.  Each  of  these  object  groups  needs  to  be  stored  in  the  ODBMS 
repository,  that  is,  "made  persistent."  Another  category  of  OVPF  objects  includes  the  user 
interface  classes.  These  are  the  support  classes  which  present  the  map  on  the  computer 
screen  and  allow  interaction  with  the  user.  It  is  very  important  that  the  user  interface 
classes  are  not  made  persistent,  for  reasons  that  will  be  presented  shortly. 

Typically,  a  complete  web  of  objects  is  made  persistent  by  reference  to  some  root 
or  parent  object  for  the  set  during  the  course  of  a  database  transaction.  This  root  object 


Figure 12. Principal Classes and Behavior for Quadtree Spatial Index

(A) SpatialDataManager is a subclass of Object, and
understands how to:

-  create and initialize a quadtree;

-  pass a geo-feature to the quadtree for insertion;

-  ask the quadtree to remove a given geo-feature; and

-  ask the quadtree for all features within a given area.

(B) SpatialDataCell is a subclass of Array, having four
indexed slots in addition to the named instance variables.
These indexed slots each hold onto an object pointer to
another instance of VPFSpatialDataCell.

Each cell understands how to:

-  determine if it is the smallest cell capable of containing the
   rectangular area requested;

-  propagate the request for the smallest cell recursively
   to the next lower-level cell;

-  propagate the request back up one level if it cannot hold
   the requested rectangle; and

-  gather and return pointers to all features contained within
   a given rectangle, regardless of the number of levels
   involved.


(A)  VPFSpatialDataManager
         Instance Variables: coverage, topCell, maxLevel
         Operations: initializeMin: minPt max: maxPt
                     collectionOfContainersFor: aRectangle
                     containerFor: aRectangle
                     returnSetOfIntersectingFeatures: aRectangle

(B)  VPFSpatialDataCell (subclass of Array)
         Instance Variables: id, superCell, level, manager, origin,
                             corner, width, features
                             (four array slots for subcells)
         Operations: canContainBoundingBox: aRectangle
                     containedLowerLevelContainerForBoundingBox: aRectangle
                     containerForBoundingBox: aRectangle
                     createLowerLevelCellForIndex: index
                     lowerLevelContainerForBoundingBox: aRectangle
                     upperLevelContainerForBoundingBox: aRectangle
can provide a named entry point to the persistent object web for future access by other
application programs. In the case of the metadata object web, the root is a collection object
of all initialized databases, keyed by their database name. For example, this collection has
a member called 'DNC01' which points to the persistent database for Norfolk Harbor.

For the feature objects, one logical root object is the spatial tree manager, which
holds a pointer to the linked list of spatial tree cells, each of which holds pointers to the
features whose bounding rectangle falls within the cells' boundaries. Each feature object
(instance of a VPFFeature subclass) holds onto its attributes stream and its symbol
(instance of a VPFFeatureSymbol subclass). Since each coverage has its own spatial index,
the spatialIndex instance variable of VPFCoverage was defined to hold the persistent
pointer to the VPFSpatialDataManager instance in charge of the coverage's quadtree (see
Figure 9 on page 52).

Another logical root object for feature objects is the instance of VPFFeatureDef
which defines a given feature class. Providing access to features via their VPFFeatureDef
instance would be useful in certain query optimizations. The features instance variable
was thus defined for VPFFeatureDef class, to hold a second set of direct pointers to the
persistent feature objects (also shown in Figure 9).

Establishing cut-points in object webs

Normally, a request to make an object persistent results in migrating the complete
transitive closure5 of all objects to which the requested object points, into the external
database. A case where this is not desirable is where links to the user interface are held by
persistent objects. One reason a user interface object should not be made persistent is that
it contains numerous references to transient objects that can only be assigned and changed
by the host operating system, such as window handles, file handles, and so on. The other
main reason a user interface object should not be made persistent is that it touches so
many of the Smalltalk run-time environment objects that it would essentially pull the
entire Smalltalk memory image into the external database with it.

ObjectStore  provides  means  of  resolving  this  issue  with  the  notion  of  cut-points. 
By  adding  a  particular  method  to  each  of  the  user  interface  classes,  the  transitive  closure 
operation  can  be  made  to  insert  a  cut-object  in  place  of  the  reference  to  the  user  interface 
object  itself.  This  cut-object  reference  is  then  replaced  at  run-time  by  the  "live"  object 
reference  when  needed. 
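The idea can be illustrated independently of ObjectStore with a small Python sketch that computes a transitive closure over instance variables and stops at objects marked as cut-points. Everything here is hypothetical (the cut_on_store flag, the CutObject placeholder and the example classes); it shows the general technique only, not the ObjectStore API.

```python
class CutObject:
    """Placeholder stored in place of an object that must stay transient."""
    def __init__(self, kind):
        self.kind = kind


def persistent_closure(root):
    """Return the objects that would be stored when root is made persistent."""
    stored, pending, seen = [], [root], set()
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if getattr(obj, "cut_on_store", False):
            stored.append(CutObject(type(obj).__name__))   # cut: go no further
            continue
        stored.append(obj)
        for value in vars(obj).values():
            children = value if isinstance(value, list) else [value]
            pending.extend(c for c in children if hasattr(c, "__dict__"))
    return stored


class RuntimeEnvironment:
    def __init__(self):
        self.window_handle = 1234      # only meaningful to the host system


class MapView:
    cut_on_store = True                # marks user interface objects as cut-points
    def __init__(self):
        self.runtime = RuntimeEnvironment()


class Symbol:
    def __init__(self, view):
        self.view = view


class Feature:
    def __init__(self, name, symbol):
        self.name = name
        self.symbol = symbol


feature = Feature("lake", Symbol(MapView()))
kinds = sorted(type(obj).__name__ for obj in persistent_closure(feature))
print(kinds)
```

Without the cut, the walk would continue from the view into the run-time environment; with it, only a small placeholder is stored and the live view is reattached after loading.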

Design  of  a  Rule-Base  Framework  to  Support  Geographic  Feature  Editing 

The  rule-based  framework  was  added  to  OVPF  in  the  second  year  of  the  project  as 
a  means  to  help  enforce  data  integrity  constraints  on  features  during  interactive  updates. 
Rules  in  this  framework  can  be  defined  to  "fire"  upon  occurrence  of  a  particular  event, 
subject  to  arbitrary  conditions  anywhere  in  the  database.  Should  one  of  the  rules  be 
triggered  and  its  associated  conditions  hold  true,  then  a  predefined  action  would  be  carried 
out.  The  following  discussion  shows  how  this  is  implemented  in  OVPF. 


5. Transitive closure is a term from graph theory, denoting the set of all pairs of nodes directly
or indirectly connected by a sequence of edges. In the case of object webs, it refers to all
objects connected by association or containment from a given root object (after Rumbaugh
et al. 1991, p. 57).
Event Objects

Events are first-class objects in this framework as they have significant state and
behavior (Arctur et al. 1995d). The PrimitiveEvent class in Figure 13 defines an
"eventMsg" attribute which is inherited by all its subclasses. For each new instance of any
event, this attribute is assigned the name of the message for which the event is raised. The
ComplexEvent class defines further attributes used by its own subclasses.


(Indentation denotes subclassing.)

PrimitiveEvent
    Instance Variable: eventMsg
    Operations: notify:

    ComplexEvent
        Instance Variables: event1, event2,
                            event1Occurred, event2Occurred

        ConjunctionEvent
            Operations: notify:

        DisjunctionEvent
            Operations: notify:

        SequenceEvent
            Operations: notify:

Figure 13. Event Class Hierarchy


The  key  method  for  each  of  the  event  classes  is  notify:.  This  method  takes  only  one 
argument  which  specifies  the  name  of  the  message  which  causes  an  event  to  be  raised.  The 
event  is  raised  when  the  object(s)  associated  with  the  event  object  receives  that  message. 
For  PrimitiveEvents  the  notify:  method  simply  compares  the  argument  to  its  own  eventMsg 
attribute  value  and  returns  true  if  they  match.  For  each  ComplexEvent  subclass,  the  notify: 
method also examines a particular combination of the status of its other attributes, before
returning true or false. An event instance is typically created at the time of rule creation.
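A hedged Python sketch of such an event hierarchy is given below. The class and attribute names mirror Figure 13, but the exact semantics of each complex event (in particular, when the occurrence flags are reset) are assumptions inferred from the class names, and the message names in the example are hypothetical.

```python
class PrimitiveEvent:
    def __init__(self, event_msg):
        self.event_msg = event_msg

    def notify(self, msg):
        """Return True if msg is the message this event is raised for."""
        return msg == self.event_msg


class ComplexEvent(PrimitiveEvent):
    def __init__(self, event1, event2):
        super().__init__(None)
        self.event1, self.event2 = event1, event2
        self.reset()

    def reset(self):
        self.event1_occurred = False
        self.event2_occurred = False


class ConjunctionEvent(ComplexEvent):
    """Assumed semantics: raised once both components have occurred, in any order."""
    def notify(self, msg):
        self.event1_occurred = self.event1_occurred or self.event1.notify(msg)
        self.event2_occurred = self.event2_occurred or self.event2.notify(msg)
        if self.event1_occurred and self.event2_occurred:
            self.reset()
            return True
        return False


class DisjunctionEvent(ComplexEvent):
    """Assumed semantics: raised when either component occurs."""
    def notify(self, msg):
        return self.event1.notify(msg) or self.event2.notify(msg)


class SequenceEvent(ComplexEvent):
    """Assumed semantics: raised when event2 occurs after event1 has occurred."""
    def notify(self, msg):
        if self.event1_occurred and self.event2.notify(msg):
            self.reset()
            return True
        if self.event1.notify(msg):
            self.event1_occurred = True
        return False


select_then_move = SequenceEvent(PrimitiveEvent("select"), PrimitiveEvent("moveTo:"))
results = [select_then_move.notify(m) for m in ("moveTo:", "select", "moveTo:", "moveTo:")]
print(results)
```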

Rule  Objects 

VPFRule objects have the structure shown in Figure 14. A single class suffices for
defining all rules. The feature instance variable may be assigned a pointer to either a
single geographic-feature instance, such as a road or lake; or to a feature class, such as the
defining class for roads or lakes. In the former, the rule will be applied only to a particular
instance whereas in the latter, the rule will be applied to all instances of the defining class.
The event instance variable is assigned a pointer to a specific event instance (introduced
above), which could be either a PrimitiveEvent or a ComplexEvent. The condition
attribute is assigned the name of a method to be executed at the time the event is signalled,
which will return true if the condition is met and false otherwise. The action method is
then executed if the condition evaluates to true. The preOrPost attribute specifies the
relative timing for execution of the condition method with respect to the message raising


VPFRule 


Description  of  Rule  attributes: 

feature:  pointer  to  a  geographic-feature  class  or  feature 
instance 

event:  pointer  to  an  instance  of  PrimitiveEvent  or  one  of  its 

subclasses 
condition:  condition-test  method  name 
action:  action  method  name 
action  Priority:  integer  value  I  (low)  to  100  (high) 
preOrPost:  flag  specifying  if  condition  is  tested 

before  or  after  the  message  raising  the  event 


Instance  Variables: 
feature 
event 
condition 
action 

actionPriority 
preOrPost 


Figure  14.  Structure  of  a  Rule  Object 
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the  event.  The  condition  may  be  evaluated  either  before  the  event  message  is  executed,  or 
upon  completion  and  return  from  the  event  message  execution.  The  actionPriority  attribute 
value  is  used  to  help  mediate  in  situations  where  multiple  rules  fire  at  the  same  time. 
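
As a rough illustration of the rule structure, the following Python sketch mirrors the six attributes listed in Figure 14. It is an assumption-laden stand-in for the Smalltalk VPFRule class: the event pointer is reduced to a message name, and the helper names applies_to and by_priority are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Rule:
    feature: Any          # a feature class (all instances) or a single instance
    event_msg: str        # stands in for the pointer to an event object
    condition: str        # name of the condition-test method
    action: str           # name of the action method
    action_priority: int  # 1 (low) to 100 (high)
    pre_or_post: str      # 'pre' or 'post' relative to the event message

    def applies_to(self, obj: Any) -> bool:
        # Class-level rules match every instance; instance-level rules match one object.
        if isinstance(self.feature, type):
            return isinstance(obj, self.feature)
        return obj is self.feature


def by_priority(rules):
    """Order rules that fire together so the highest actionPriority runs first."""
    return sorted(rules, key=lambda r: r.action_priority, reverse=True)
```

The same class serves both kinds of rule: only the value stored in the feature attribute decides whether one instance or a whole feature class is covered.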

Event  Detection  Mechanism 

The final component of this framework is the mechanism by which events are
detected and rules are fired. In the OVPF viewer/editor tool, all changes to geo-feature
objects are handled through the use of FeatureConstructor objects, which use a script-based
framework with a state machine, supporting asynchronous events for flexibility in
working with runtime-dependent constraints on changes to a given feature (see footnote 6). This
framework has the potential for extending its own semantics at runtime. See Figure 15 for
a simplified representation of the VPFFeatureConstructor hierarchy.

With this framework, a user request to create or modify a geo-feature via the GUI
is forwarded to the appropriate PointFeatureConstructor, LineFeatureConstructor or
AreaFeatureConstructor. The constructor is given the name of the geo-feature class,
which it instantiates with default values for all attributes. In the case of creating a new
geo-feature object, the constructor then prompts the user for the feature's location. At this
point, the constructor notifies the new feature object of the intended action. This
notification results in a lookup to the feature's rule base. Any rules having events defined
for the current operation will have the opportunity to check for any particular conditions in
the database that are of interest. After all rules' conditions have been checked, those which
evaluated true are sorted in priority order, and their respective actions are performed. An
example of usage is provided in the Results section following.

6. The FeatureConstructor framework was first developed by Bob Williams for OFM. Very
minor changes were needed to accommodate the rule-based capability.

VPFFeatureConstructor
    Instance Variables: feature, nextAction, point
    Operations: stopCreateFeature:

Subclasses:
    VPFPointFeatureConstructor
        Operations: point1:
    VPFLineFeatureConstructor
        Operations: point1:, point2:
    VPFAreaFeatureConstructor
        Operations: points:

Figure 15. VPFFeatureConstructor Hierarchy


RESULTS 

This  section  presents  a  summary  of  findings  from  this  research  and  development. 
This  is  in  two  parts:  the  first  part  shows  examples  of  general  usage,  and  the  second  part 
describes  the  operation  of  the  rule-based  framework. 

OVPF  Application  Overview 

The diagram in Figure 16 shows functional relationships among the principal
modules of the OVPF application. The two main points of control in this figure are the
graphical user interface (GUI) and the metadata framework. The GUI provides the user
with the menus and programming access with which to direct the operation of OVPF. The
metadata framework carries out the bulk of the processing of VPF source data for
migration to the internal object model and to the ODBMS. The metadata model is also
responsible for exporting edited OVPF data back to the relational VPF file structure.

Figure 16. Principal OVPF Components

Transformation  of  Relational  Vector  Product  Format  Data  to  an  Object  Web 

As described in the preceding Materials and Methods section, the import of
source data into the OVPF application at runtime is accomplished in stages. First, the
metadata object web for a given VPF database is initialized, after which the geo-feature
data can be interpreted and displayed on the screen. As the geo-feature data is read by
OVPF, it is inserted into the quadtree spatial index structure. Figure 17 shows all the
significant definitional relationships among the metadata, the feature objects, and the
spatial index structure. Figure 18 shows the dynamic associations among the runtime
instances of metadata, feature, and spatial index objects.

Displaying  Spatial  Features 

An overview of the main steps in reading, indexing and displaying a Vector
Product Format map (from either the relational source files or from the ODBMS) is
depicted in Figure 19 below. Figure 20 shows a "screen capture" of the OVPF map
window display with a portion of the Norfolk Approach library.

Figure 19. Transfer of Spatial Features from VPF to OVPF
Geo-features are
(1) imported either from relational VPF files or from ODBMS, then
(2) placed in quadtree, and
(3) rendered on screen.
Note that the ODBMS contains whole features, while 4 or more
georelational files are required to define each individual feature.
Source: after (Arctur et al. 1995c)

Figure 20. OVPF Map Display of Multiple Coverages
in Norfolk Approach Library of DNC01


Migrating  Object  Webs  to  ODBMS 

As mentioned in the Methods section, not all of the OVPF data should be placed in
the ODBMS repository. In particular, the GUI objects should not be allowed to migrate to
the persistent data store, as this would inevitably result in migrating most of the Smalltalk
development environment through the transitive closure from the GUI root objects. Figure
21 shows the relationships among the main groupings of objects in OVPF, and which of them are
managed by the ODBMS.

OVPF User Interface - instances of:
    - VPFMapWindow subclasses
    - VPFMapPane subclasses
    - VPFFeatureEditor subclasses
    - VPFGraphicsEngine

(A)

Metadata - instances of:
    - VPFDatabase
    - VPFLibrary
    - VPFCoverage
    - VPFFeatureDef
    - VPFTableHeader

Spatial Tree - instances of:
    - VPFSpatialDataManager
    - VPFSpatialDataCell
        |
Feature Data - instances of:
    - VPFFeature subclasses
    - VPFFeatureSymbol subclasses
        |
Graphic Primitives - instances of:
    - VPFDrawOrders subclasses

(B)

Figure 21. Persistency and Linkages of Principal OVPF Components
(A) Non-persistent objects
(B) Persistent objects
Source: after (Arctur et al. 1995a; Cobb et al. 1995a)


Applying  the  Rule-Base  Framework  for  Feature  Editing 


In order to demonstrate the rule-based framework, an example rule to prevent any
BuildingPoint geographic features from being placed over water was implemented and
tested. Sample VPFRule and VPFPrimitiveEvent object structures are shown in Figure 22.
In this case, the VPFRule instance is associated with the BuildingPoint class and thus will
be applied to all instances of that class. Alternatively, the user may associate the rule with
a particular BuildingPoint instance. Due to the setting of the preOrPost attribute, the
condition method onWater: is evaluated before the Event's eventMsg (the newPoint:
method) is carried out. If the condition method onWater: returns true, the action method
stopCreateFeature will then be executed, which will prevent the eventMsg method
newPoint: from being performed. The actionPriority setting ensures this action will have
highest priority among any other VPFRules which may also fire.


aVPFRule
    feature         -> DNCBuildingPoint class
    event           -> a PrimitiveEvent
                           eventMsg: 'newPoint:'
    condition:      'onWater:'
    action:         'stopCreateFeature'
    actionPriority
    preOrPost

Legend: object; attribute; association pointer; value object

Figure 22. Example Rule and Event Objects
Source: (Arctur et al. 1995d)

At this point we need to introduce the rest of the framework in which Events are
detected and Rules are fired. In the OVPF viewer/editor tool, all changes to geographic-feature
objects are handled through the use of FeatureConstructor objects (see Figure 23).
With reference to our example for creating a new BuildingPoint feature, we assume a Rule-Event
pair has already been created (for checking if a new point feature is over water) and
stored in the DNCBuildingPoint's rules dictionary (a class instance variable defined in the
VPFFeature class; see Figure 23B). This rule base is actually stored physically in the
ODBMS.


(A)
VPFFeatureConstructor
    Instance Variables: feature, nextAction, point
    Operations: onWater:, stopCreateFeature
    Subclasses:
        VPFAreaFeatureConstructor
        VPFLineFeatureConstructor
        VPFPointFeatureConstructor
            Operations: point1:

(B)
VPFFeature
    Class Instance Variables: rules
    Operations: notify:argList:preOrPost:from:, newPoint:
    Subclass (partial hierarchy):
        DNCBuildingPoint

Figure 23. Key Components of Event Detection Framework
(A) Partial FeatureConstructor Class Hierarchy
(B) Partial Feature Class Hierarchy
Source: after (Arctur et al. 1995d)

The following sequence of events could then take place at the user's initiation (step
numbers correspond to those in Figure 24):

1.  The user chooses the appropriate OVPF menu option to add a new
geographic feature, and selects BuildingPoint from a list of available feature
classes.

2.  The OVPF graphical user interface (GUI) creates a PointFeatureConstructor.

3.  The Constructor creates a default BuildingPoint feature object, and initiates a
request to the GUI for a user-selected location coordinate point, to be
returned via the point1: message.

4.  On instruction from the GUI, the user chooses a location on the map with the
mouse, and the GUI returns it as the argument in the point1: message to the
Constructor.

5.  Within its point1: method, the Constructor notifies the new BuildingPoint
feature instance of an impending Event via the parameterized
notify:argList:preOrPost:from: message.

6.  The new BuildingPoint object executes the inherited
notify:argList:preOrPost:from: method, which checks the rule base for all Rule-Event pairs
whose eventMsg matches the notify: argument, in this case newPoint:.

7.  If a matching Rule-Event pair is found, then the Rule's condition value
(onWater:) is sent as a message to the Constructor to perform. The
Constructor's onWater: method checks the database for any water-related
features within a given tolerance of the user-selected coordinates, and returns
true or false. By user's preference, this check can be performed either on just
the features currently being displayed, or on features from all coverages in
the ODBMS.

8.  If the onWater: method returns true (a coincident water feature was found), the
Rule's action message, in this case stopCreateFeature, is then sent to the
Constructor. Note that in the present framework, all
applicable conditions are evaluated before any actions are performed. If
multiple conditions return true, their action messages are sent to the
PointFeatureConstructor in order of decreasing actionPriority.

9.  If the Constructor receives the message stopCreateFeature, it will set its
nextAction attribute to 'stop'.

10. Upon completion of all applicable conditions and actions, the new
BuildingPoint object returns from executing the
notify:argList:preOrPost:from: method. The thread of control reverts to the
Constructor's point1: method, which then checks its nextAction setting. If it
is 'stop' then the new default BuildingPoint feature is discarded, and control
returns to the user with a descriptive dialog message.

11. If the nextAction is not 'stop' then the Constructor sends the newPoint:
message to the new BuildingPoint, inserts it in the spatial quadtree, and
presents the user with a dialog window to fill in any BuildingPoint feature
attributes needed.

Action Summary

1.  User chooses menu option to add a new feature.
2.  GUI creates a constructor for the new feature object.
3.  Constructor creates a default instance of BuildingPoint, and requests coordinate point from GUI.
4.  GUI returns user-defined location coordinates for new feature.
5.  Constructor sends message:
        feature notify: 'newPoint:' argList: (point) preOrPost: 'pre' from: self
6.  Feature scans rule base for rules with event message 'newPoint:'.
7.  Feature finds rule and evaluates condition message:
        constructor perform: 'onWater:'
    constructor then queries ODBMS and returns true or false.
8.  If condition evaluates true, feature sends message:
        constructor perform: stopCreateFeature
9.  If constructor has to stopCreateFeature, then constructor assigns 'stop' value to its nextAction attribute.
10. If constructor's nextAction is 'stop' it discards the new feature.
11. If constructor's nextAction was not 'stop' then it sends the message:
        feature newPoint: point
    and finally inserts the new feature in the quadtree.

Direction of Messages (top to bottom in the original diagram)

User
OVPF GUI
PointFeatureConstructor
OVPF GUI
PointFeatureConstructor
BuildingPoint
Rules
(steps 7, 8) PointFeatureConstructor
ODBMS
(steps 10, 11) BuildingPoint
(step 11) Spatial Quadtree

Figure 24. Flow of Control and Behavior For Rule-Event Example
Source: after (Arctur et al. 1995d)
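
The eleven steps can be traced in a compact sketch. The code below is Python, not the Smalltalk of OVPF, and is a simplification under stated assumptions: rules are plain dictionaries, the ODBMS spatial query is replaced by membership in a set of water coordinates, the quadtree is replaced by a list, and the GUI dialog steps are omitted.

```python
class BuildingPoint:
    rules = []  # class-level rule base, shared by all instances

    def __init__(self):
        self.location = None

    def notify(self, event_msg, arg_list, pre_or_post, sender):
        # Step 6: find rules whose event message and timing match.
        matching = [r for r in self.rules
                    if r["eventMsg"] == event_msg and r["preOrPost"] == pre_or_post]
        # Step 7: evaluate every condition before performing any action.
        fired = [r for r in matching
                 if getattr(sender, r["condition"])(*arg_list)]
        # Step 8: perform actions in order of decreasing actionPriority.
        for r in sorted(fired, key=lambda r: r["actionPriority"], reverse=True):
            getattr(sender, r["action"])()

    def new_point(self, point):
        self.location = point


class PointFeatureConstructor:
    def __init__(self, feature_class, water_points, index):
        self.feature = feature_class()      # step 3: default instance
        self.next_action = None
        self.water_points = water_points    # stand-in for the ODBMS spatial query
        self.index = index                  # stand-in for the quadtree

    def on_water(self, point):
        return point in self.water_points

    def stop_create_feature(self):
        self.next_action = "stop"           # step 9

    def point1(self, point):
        # Step 5: announce the impending newPoint: message.
        self.feature.notify("newPoint:", [point], "pre", self)
        if self.next_action == "stop":      # step 10: discard the feature
            return None
        self.feature.new_point(point)       # step 11
        self.index.append(self.feature)
        return self.feature


# The Rule-Event pair assumed to exist before the user starts editing.
BuildingPoint.rules.append({
    "eventMsg": "newPoint:", "condition": "on_water",
    "action": "stop_create_feature", "actionPriority": 100, "preOrPost": "pre",
})
```

With this rule in place, calling point1 with a coordinate found in water_points returns nothing and leaves the index unchanged, while any other coordinate yields a new feature that is added to the index.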

This  simple  example  can  easily  be  extended  to  encompass  multiple  rules  for  a 
given  geographic  feature,  as  well  as  to  handle  multiple  features.  In  addition  to  the  "either- 
or"  situation  represented  in  this  example,  a  rule  could  be  based  on  prerequisite  and 
corequisite  existence  of  other  features,  even  occurring  in  a  particular  temporal  sequence  or 
logical  combination.  It  is  simply  necessary  for  the  FeatureConstructor  class  or  one  of  its 
subclasses  to  mediate  all  requests  for  changes  or  additions  to  geographic  features  by  the 
user,  and  for  the  FeatureConstructor  method  invoked  to  check  the  affected  feature's  rule 
base. 


DISCUSSION 


Several implications follow from this research. In the following
sections, I first address this work in terms of its initial objectives, then
discuss the limitations so far recognized in the technologies and designs used. A
look at future directions and a summary conclude this thesis.

Implications  of  Research  for  Meeting  Initial  Objectives 

The  objectives  stated  on  page  8  include  supporting  (1)  complex  interdependencies 
among  geographic  features,  (2)  very  large  databases,  and  (3)  the  potential  for  expert 
system  applications.  Each  of  these  will  be  discussed  in  turn. 

Supporting  Complex  Interdependencies  Among  Geographic  Features 

The  descriptions  of  Vector  Product  Format  file  structures  for  representing  geo- 
feature  data,  and  the  object  webs  created  in  OVPF  to  capture  this  information,  show  that 
this  Smalltalk-based,  object-oriented  data  model  is  very  versatile  and  expressive.  In 
related  development  work  for  the  Naval  Research  Laboratory,  this  framework  has  been 
adapted  to  import  and  display  geographic  data  from  four  different  kinds  of  VPF  product 
databases  simultaneously  (Digital  Nautical  Chart,  World  Vector  Shoreline,  Vector  Smart 
Map, and Urban Vector Smart Map).

In addition to the extensibility of the metadata structure facilitated by the object-
oriented  class  hierarchy,  the  rule-based  framework  described  here  provides  another  level 
of  extensibility.  Because  the  Smalltalk  language  supports  incremental  dynamic 
compilation,  rule-based  actions  could  actually  trigger  the  creation  of  additional  classes 
and  methods  at  runtime,  according  to  the  needs  of  the  application.  This  might  be  done,  for 
example,  to  augment  the  behavior  of  existing  geographic  feature  objects  to  respond  to  new 
conditions  in  their  environment  that  might  only  affect  some  of  the  features  and  not  others. 
It  is  also  conceivable  that  a  given  geographic  feature  might  evolve  into  a  different  kind  of 
feature  over  time;  the  framework  described  here  could  support  such  an  evolution. 

This  rule-based  approach  can  be  used  in  three  distinct  situations:  (1)  immediate 
mode,  to  execute  rules  immediately  before  or  after  some  state  change;  (2)  deferred  mode, 
to  execute  rules  at  the  end  of  several  changes;  and  (3)  detached  mode,  to  perform  rule- 
based  actions  separately  from  the  state  changes.  Furthermore,  it  has  the  advantage  over 
traditional  inference-engine  approaches  in  that  it  will  work  with  an  arbitrarily-large 
database  of  persistent  objects,  rather  than  being  limited  to  those  objects  which  can  fit  in 
memory.  This  approach  should  support  the  types  of  complex  interdependencies  commonly 
found  in  facilities  management  applications,  such  as  with  public  utilities  networks. 
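
The three modes can be pictured with a small scheduler. This Python sketch is hypothetical throughout: the thesis does not describe such a class, and the names RuleScheduler, end_of_changes and run_detached are assumptions chosen only to show how the timing of rule actions differs between modes.

```python
from collections import deque


class RuleScheduler:
    """Runs rule actions in immediate, deferred, or detached mode."""

    def __init__(self):
        self.deferred = deque()   # run when the current batch of changes ends
        self.detached = deque()   # run later, outside the change itself
        self.log = []

    def submit(self, action, mode="immediate"):
        if mode == "immediate":
            action(self.log)      # runs right away, next to the state change
        elif mode == "deferred":
            self.deferred.append(action)
        elif mode == "detached":
            self.detached.append(action)
        else:
            raise ValueError(f"unknown mode: {mode}")

    def end_of_changes(self):
        # Deferred rules run together once several changes are complete.
        while self.deferred:
            self.deferred.popleft()(self.log)

    def run_detached(self):
        # Detached rules run separately from the state changes.
        while self.detached:
            self.detached.popleft()(self.log)
```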

The  rule-event  framework  and  procedures  were  surprisingly  simple  to  implement. 
The  FeatureConstructor  classes,  together  with  a  single  supporting  method  in  VPFFeature 
class  (notify:argList:preOrPost:from:),  provide  a  simple  and  flexible  event  detection  and 
rule processing system. While it introduces some processing overhead, all but the spatial
query to the ODBMS (see step 7, Figure 24 on page 82) are very fast operations. An
important benefit of this object-oriented framework is the potential for direct reuse by
other FeatureConstructors of condition checks and actions such as the onWater: and
stopCreateFeature  methods.  Furthermore,  with  this  system  provision  can  be  made  for 
adding  and  changing  rules  at  runtime. 

The  design  presented  here  is  easily  extended  to  trigger  on  any  kind  of  change 
(create,  modify,  delete)  to  geographic-feature  objects,  as  well  as  to  specific  feature 
attributes  and  spatial  coordinates  of  a  given  feature  object.  This  could  be  a  significant 
advantage  over  the  triggers  supported  by  many  commercial  relational  and  even  hybrid 
object-relational  DBMSs.  Except  for  Sybase,  these  DBMSs  can  typically  trigger  only  on 
insert,  update  or  delete  of  a  complete  feature  record,  rather  than  being  able  to  discriminate 
on  changes  made  to  a  single  feature  attribute. 

It  might  be  noted  that  this  rule-based  framework  is  not  limited  to  implementation 
in  an  object-oriented  system.  While  the  object-oriented  properties  of  hierarchical 
definition  and  inheritance  in  Smalltalk  facilitated  a  simple  design,  the  same  functionality 
could  be  achieved  in  a  non-object-oriented  language  with  appropriate  data  structures  and 
procedures.  It  seems  likely  that  rule-based  frameworks  like  this  could  find  their  way  into 
many  more  kinds  of  applications  in  the  future. 

Supporting  Very  Large  Databases 

As  a  result  of  using  a  commercial  ODBMS  for  the  geo-data  repository,  we  can 
immediately  start  to  consider  working  with  very  large,  distributed  databases.  ObjectStore 
has  been  demonstrated  already  to  support  terabyte-sized  databases,  and  its  client-server 
architecture  with  such  features  as  shared-page  caching  is  well  suited  to  multi-user 
applications. Other ODBMSs are also likely candidates for large applications like this. It
was found that spatial queries were as much as hundreds of times faster when reading
from  the  ODBMS  repository  than  from  the  relational  VPF  source  files.  While  further 
tuning  of  either  approach  is  no  doubt  possible,  the  ODBMS  interface  was  far  simpler  to 
design  and  implement,  especially  given  the  need  to  support  topology. 

However,  storing  a  large  database  is  only  part  of  the  problem;  another  is  providing 
access  to  it  with  reasonable  performance.  The  technique  demonstrated  here  of  placing 
each  coverage  in  a  separate  spatial  index  shows  very  good  potential  for  helping  to  manage 
the  visibility  of  unnecessary  data  while  the  user  is  trying  to  identify  a  region  of  interest. 
The  rule-based  concepts  implemented  here  could  also  be  applied  to  facilitate  query 
optimization  across  multiple,  heterogeneous  databases  distributed  over  a  wide-area 
network. For example, rules could be defined to check the visibility or access privileges
of  one  or  more  portions  of  a  distributed  database  before  starting  a  potentially  long 
transaction.  To  support  this,  a  client  ObjectStore  application  could  be  set  up  on  the  host 
server.  This  client  application  could  serve  as  the  effective  host  application  to  the  actual 
users,  filtering  the  data  prior  to  shipping  it  over  the  network.  The  GemStone  ODBMS 
already  is  organized  to  support  this  separation  of  work  between  a  host  application  process 
and  each  client  process. 

It was pointed out that each quadtree cell (VPFSpatialDataCell instance) holds a
collection of direct object pointers to its geo-features (see "Design of an Object-Oriented
Spatial Index" starting on page 60). This makes the quadtree not just a spatial
index but also an efficient, general-purpose container structure for the features. This
design is essentially independent of the actual VPF feature structure, and allows us to
modify the implementation of the spatial tree at any time without affecting the rest of
OVPF or the source data. Thus, in the future we could easily substitute a range tree (Samet
1994;  Beckmann  et  al.  1990;  Brinkhoff  et  al.  1993),  or  special  optimizing  techniques  in 
place  of  the  present  quadtree  approach  (this  will  be  discussed  further  below).  This  design 
could  also  support  the  simultaneous  implementation  of  multiple  spatial  indexing  schemes, 
to  allow  choice  of  the  most  efficient  spatial  tree  design  for  a  given  source  database.  This 
might  be  thought  of  as  pluggable  spatial  indexing. 
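
To make the idea of pluggable spatial indexing concrete, the sketch below defines two interchangeable index classes that share the same insert and query operations. It is Python rather than Smalltalk and is not the OVPF quadtree: the class names, the bounding-box representation (xmin, ymin, xmax, ymax) and the depth limit are all assumptions.

```python
def intersects(a, b):
    """Axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(outer, inner):
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


class LinearIndex:
    """Simplest possible index: scan everything."""

    def __init__(self):
        self.items = []

    def insert(self, feature, box):
        self.items.append((feature, box))

    def query(self, box):
        return [f for f, b in self.items if intersects(b, box)]


class QuadtreeCell:
    """Each cell holds direct references to its features."""

    def __init__(self, bounds, depth=0, max_depth=4):
        self.bounds, self.depth, self.max_depth = bounds, depth, max_depth
        self.items = []
        self.children = None

    def _make_children(self):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quads = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                 (x0, ym, xm, y1), (xm, ym, x1, y1)]
        self.children = [QuadtreeCell(q, self.depth + 1, self.max_depth)
                         for q in quads]

    def insert(self, feature, box):
        if self.depth < self.max_depth:
            if self.children is None:
                self._make_children()
            for child in self.children:
                if contains(child.bounds, box):
                    return child.insert(feature, box)
        # A feature that fits in no single child stays in this larger cell.
        self.items.append((feature, box))

    def query(self, box):
        found = [f for f, b in self.items if intersects(b, box)]
        for child in self.children or []:
            if intersects(child.bounds, box):
                found.extend(child.query(box))
        return found
```

Because callers only use insert and query, either class can be handed to the rest of an application without changes elsewhere, which is the property the text calls pluggable.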

Another  issue  addressed  by  this  design  that  becomes  more  important  over  time  has 
to  do  with  changes  to  the  nonspatial  attributes  of  geo-features  in  a  given  feature  class  or 
coverage.  As  business  requirements  and  data  sources  evolve  and  change,  it  is  often 
necessary  to  add,  remove,  or  change  the  value  range  of  nonspatial  attributes  for  one  or 
more  feature  classes.  (In  a  complex  database  specification  such  as  VPF,  this  must  be  done 
with  care,  to  ensure  consistency  of  attribute  usage  and  values  across  similar  feature  classes 
in  different  coverages  and  libraries.)  As  described  in  the  Materials  and  Methods  section, 
page  54,  OVPF  makes  use  of  Smalltalk  ReadWriteStream  objects  to  hold  all  such 
attributes  in  a  single  byte-stream,  in  which  each  attribute's  position  and  length  in  the  byte- 
stream  is  known  via  the  feature  class'  schema.  This  rather  non-object-oriented  way  of 
aggregating  many  small  pieces  of  information  is  much  more  memory-efficient  than  if  we 
had  created  separate  instance  variables  and  value-objects  for  each  geo-feature  attribute.  It 
also  supports  changes  in  attribute  structure  for  a  feature  class  without  affecting  the  object 
class  definitions.  This  means  that  attributes  can  be  added  to,  or  removed  from,  the  feature 
class  definition  without  having  to  redesign  the  OVPF  structure.  This  could  even  be  done  at 
runtime.  The  issue  of  updating  older  data  with  the  previous  attribute  structure  to  the  new 
structure must still be addressed, which could be difficult for large databases.

Supporting Potential for Expert System Applications

Progress is being made in coping with the complexity of decision
making faced by planners through the use of expert systems. The framework presented
here  seems  to  be  a  good  candidate  for  further  research  in  this  area.  With  Smalltalk's 
reflective  and  dynamic  compilation  capabilities,  there  are  no  inherent  limits  to  the  ability 
to  create  and  modify  a  rule-base  of  events,  conditions  and  actions  at  runtime.  This  implies 
the  system  could  have  the  capability  to  learn  and  evolve  its  semantics  (general  behavior) 
as  a  function  of  usage  patterns  and  environmental  conditions. 

Limitations  of  the  Present  Application 

The  various  frameworks  described  in  this  thesis  present  numerous  possibilities  for 
extensions  and  enhancements.  However,  there  remain  some  significant  limitations  in  the 
present  implementation  of  OVPF.  These  are  grouped  here  according  to  (1)  feature  class 
definitions,  (2)  spatial  index,  (3)  GIS  functionality,  (4)  the  rule-based  framework,  and 
(5)  the  Smalltalk  language. 

Feature  Class  Definitions 

The  present  design  for  OVPF  lacks  support  for  variable-length  feature  attributes. 
It  has  been  found  that  some  feature  classes  in  the  Vector  Smart  Map  (DMA  1993c)  and 
Urban  Vector  Smart  Map  (DMA  1994b)  specifications  have  variable-length  text  attributes, 
whereas  all  other  nonspatial  attributes  in  VPF  databases  have  been  fixed-length  in  nature. 
To  accommodate  this,  we  would  most  likely  follow  the  VPF  specification  for  storing  such 
data in the feature tables, by including the integer-length of the attribute as the first four
bytes of the attribute's value. Since the feature table schema states which attributes have
variable  length,  it  would  be  a  straightforward  matter  to  modify  the  standard  accessing 
protocol  in  VPFFeature  ("valueForAttribute:  aName"  and  "putValue:  aValue  forAttribute: 
aName"  methods)  to  properly  handle  these  exceptions. 
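
A possible shape for such a schema-driven accessor is sketched below in Python. It is an illustration only, not the OVPF ReadWriteStream code: the function names, the space padding of fixed-length fields, the ASCII encoding and the little-endian byte order of the four-byte length prefix are assumptions.

```python
import struct


def pack_record(schema, values):
    """Pack attribute values into one byte string.

    schema is a list of (name, length) pairs; a length of None marks a
    variable-length text attribute.
    """
    out = bytearray()
    for name, length in schema:
        raw = values[name].encode("ascii")
        if length is None:
            out += struct.pack("<i", len(raw))   # four-byte length prefix
            out += raw
        else:
            out += raw.ljust(length)[:length]    # fixed length, space padded
    return bytes(out)


def value_for_attribute(schema, record, wanted):
    """Walk the schema to locate one attribute inside the byte string."""
    pos = 0
    for name, length in schema:
        if length is None:
            # Variable-length attribute: read its length from the prefix.
            (length,) = struct.unpack_from("<i", record, pos)
            pos += 4
        if name == wanted:
            return record[pos:pos + length].decode("ascii").rstrip()
        pos += length
    raise KeyError(wanted)
```

One consequence visible in the sketch is that positions after a variable-length attribute can no longer be computed from the schema alone, so the accessor has to walk the record.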

Spatial  Index 

The  simplistic  quadtree  implemented  so  far  has  a  number  of  shortcomings,  as 
observed  by  Cobb  et  al.  (1995b).  For  example,  all  spatial  queries  begin  from  the  root  (top) 
of  the  tree.  Cobb's  spatial  splay  tree  approach  addresses  this  issue,  by  storing  pointers  to 
the  most  recently-accessed  quadtree  cells  at  or  near  the  root  of  the  splay  tree.  This  has 
been  found  to  result  in  significant  improvement  in  query  performance. 

Another issue, however, is that of managing insertion of geo-feature objects in the
index  that  fall  on  a  quadtree  cell  boundary.  When  this  happens,  the  geo-feature  pointer  is 
moved  back  up  one  level  in  the  quadtree,  to  the  next-larger  cell.  With  small  or  sparse 
geographic  databases  this  is  not  a  problem,  but  with  dense  coverages  this  can  degrade 
access  performance.  One  way  of  overcoming  this,  while  preserving  the  advantages  of  the 
quadtree  approach,  is  to  use  overlapping  quadtree  cells.  In  this  case,  each  quadtree  cell 
could  overlap  its  neighbor  by  up  to  25  percent,  to  hold  any  geo-features  which  coincide 
with  its  boundary.  By  traversing  the  cells  at  a  given  level  in  a  consistent  order,  say 
clockwise,  each  feature  would  still  have  a  unique  index  key.  While  I  have  not  seen  any 
technical  documentation  on  this  approach,  both  Laser-Scan  (1995a)  and  Smallworld 
(1995) use it for their spatial indexes.

GIS Functionality

While  a  significant  amount  of  work  has  been  done,  much  more  functionality  is 
required  for  the  OVPF  application  to  represent  a  GIS  according  to  standard  industry 
guidelines.  So  far,  changes  to  topology  have  limited  support.  While  the  data  structures  are 
essentially  complete,  further  work  is  required  to  allow  any  kind  of  change  in  topology  to 
be correctly handled. Chung et al. (1995) describe the operations which are now
supported  and  those  which  are  still  needed.  Topology  support  affects  not  just  data 
integrity,  but  also  the  range  of  spatial  analysis  functions  which  can  be  performed. 
Presently  OVPF  incorporates  very  few  of  the  "usual"  spatial  analysis  procedures 
(Goodchild  1988,  1994;  Tomlin  1990).  For  example,  it  lacks  functions  such  as 
edgematching,  dissolving  lines  and  merging  attributes,  line  thinning,  weeding  and 
smoothing,  centroid  calculation,  and  others.  Those  which  are  implemented  are  part  of  the 
standard  Smalltalk  library,  such  as  raster-vector  transformations  and  point-in-polygon 
testing,  and  even  these  may  need  further  optimization. 

Another  shortcoming  is  that  OVPF  presently  supports  only  very  simple  queries. 
The  protocol  for  spatial  queries  is  limited  to  returning  all  geo-features  within  a  specified 
area,  without  regard  for  filtering  criteria  based  on  nonspatial  attributes.  The  versatile 
ReadWriteStream  approach  for  storing  attributes  has  the  drawback  that  we  must  always 
interpret  the  byte  stream  to  perform  comparisons  on  any  values  stored  in  the  stream.  For 
faster  performance  on  queries,  it  may  be  helpful  to  create  individual  indexes  on  specific 
nonspatial  attributes,  for  instance  by  storing  such  attribute  values  and  geo-feature  object 
pointers  in  a  hash  dictionary  or  binary-tree  structure. 
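
A secondary index of this kind might look like the sketch below. It is written in Python for illustration and is not part of OVPF; the class `AttributeIndex` and the function `filtered_query` are hypothetical names, and features are assumed to be identified by simple ids. The index is built once, so that queries no longer need to decode each feature's attribute byte stream.

```python
from collections import defaultdict

class AttributeIndex:
    """Maps the values of one nonspatial attribute to the ids of the
    geo-features that hold them (a hash-dictionary index)."""

    def __init__(self):
        self._by_value = defaultdict(set)

    def add(self, feature_id, value):
        self._by_value[value].add(feature_id)

    def lookup(self, value):
        # Return a copy so callers cannot modify the index by accident.
        return set(self._by_value.get(value, ()))

def filtered_query(spatial_hits, index, value):
    """Keep only those spatial query results whose indexed attribute
    equals `value`."""
    return set(spatial_hits) & index.lookup(value)
```

The result of the existing spatial query (all geo-features within an area) is passed in as `spatial_hits`, and the attribute filter is applied as a set intersection.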

Rule-Based  Framework

There  are  a  number  of  issues  related  to  supporting  rules  that  have  not  been 
addressed  as  yet.  The  number  of  rules  defined  in  an  application  can  become  very  large  and 
may  be  defined  by  various  users  at  different  points  in  time.  This  can  lead  to  the  problem  of 
having  inconsistent  or  conflicting  rules  present  within  the  application.  For  example,  user 
A  may  define  a  rule  R1  whose  action  may  trigger  rule  R2  defined  by  user  B.  Suppose  rule
R2's  action  results  in  triggering  rule  R1,  thereby  yielding  an  infinite  loop.  From  this
scenario,  it  is  evident  that  a  mechanism  for  establishing  the  consistency  or  correctness  of 
rules  must  be  an  inherent  part  of  any  active  system.  This  involves  writing  algorithms 
which  statically  detect  rule  conflicts  as  well  as  algorithms  which  dynamically  detect 
problems  such  as  infinite  rule  triggering  (Arctur  et  al.  1995d). 
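
One simple form of static detection is to treat the rule base as a directed graph and search it for cycles. The sketch below, in Python, assumes that for each rule we can list the rules its action may trigger; the function name `find_trigger_cycle` and the dictionary representation are hypothetical and are not the mechanism described in Arctur et al. (1995d). Dynamic detection, such as a limit on triggering depth at run time, is not shown.

```python
def find_trigger_cycle(triggers):
    """Depth-first search for a triggering cycle.

    `triggers` maps a rule name to the names of the rules its action
    may trigger. Returns the rule names along one cycle (first rule
    repeated at the end), or None if the rule base is acyclic.
    """
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {}
    path = []  # rules on the current triggering chain

    def visit(rule):
        state[rule] = ACTIVE
        path.append(rule)
        for nxt in triggers.get(rule, ()):
            nxt_state = state.get(nxt, UNSEEN)
            if nxt_state == ACTIVE:
                # nxt is already on the current chain: a loop exists.
                return path[path.index(nxt):] + [nxt]
            if nxt_state == UNSEEN:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[rule] = DONE
        return None

    for rule in list(triggers):
        if state.get(rule, UNSEEN) == UNSEEN:
            cycle = visit(rule)
            if cycle:
                return cycle
    return None
```

For the scenario above, `find_trigger_cycle({"R1": ["R2"], "R2": ["R1"]})` reports the chain R1, R2, R1.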

Further  work  is  also  needed  to  build  a  user  interface  for  development
and  modification  of  the  rule  base.  This  interface  could  provide  a  graphical  representation 
of  a  state  machine  allowing  the  user  to  define  and  modify  the  events,  conditions  and 
actions  for  each  rule.  This  could  be  a  component  of  a  graphical  flowchart-like  facility 
allowing  the  user  to  create  and  modify  step-by-step  feature  construction  scripts. 

The  rule-based  framework  implemented  so  far  represents  a  data-driven  (also  called 
bottom-up  or  forward  chaining)  system  with  very  primitive  inferencing  capability.  Goal- 
driven  (top-down  or  backward  chaining)  capability  would  also  be  important  for 
application  in  an  expert  system,  and  this  will  require  further  research  and  development 
efforts. 
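
The difference between the two inferencing styles can be illustrated with a toy propositional rule base. This Python sketch is not the OVPF rule framework; the representation of a rule as a (premises, conclusion) pair and the fact names are assumptions made for the example.

```python
def forward_chain(facts, rules):
    """Data-driven: start from known facts and fire every rule whose
    premises hold, until nothing new can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts, rules, _pending=frozenset()):
    """Goal-driven: start from the goal and work back through rules
    that conclude it, until known facts are reached."""
    if goal in facts:
        return True
    if goal in _pending:  # guard against circular rule chains
        return False
    pending = _pending | {goal}
    return any(
        all(backward_chain(p, facts, rules, pending) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )

# Hypothetical rules: high water closes a bridge, which blocks a route.
RULES = [
    (("high_water",), "bridge_closed"),
    (("bridge_closed",), "route_blocked"),
]
```

Forward chaining derives everything that follows from the data, whereas backward chaining examines only the rules relevant to the question being asked, which is why it suits expert-system consultation.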

An  important  next  step  would  be  to  extend  the  proof-of-concept  developed  here 
into  a  larger  study  of  the  rule-based  framework  with  more  realistic  data  from  various  GIS 
application  domains.  Navinchandra  (1993,  pp.  87-89)  points  out  a  number  of  issues
related  to  the  application  of  artificial  intelligence  techniques  to  GIS.  One  of  the  key  issues 
is  the  risk  that  applications  will  not  scale  up  adequately  from  the  research
prototype  to  realistic  production  models.  This  is  particularly  true  with  a  moderate  to  large 
rule  base,  due  to  interactions  among  the  rules  that  become  difficult  to  anticipate  and  test. 

Smalltalk  Language 

This  entire  application  has  been  implemented  in  Smalltalk.  This  includes  such 
significant  components  as  the  relational  data  import  and  export  facilities,  metadata  object 
hierarchy,  the  feature  data  hierarchy,  topology  hierarchy,  spatial  quadtree  index,  graphical 
user  interface,  low-level  graphical  object  representations  and  operations,  and  byte-level 
data  format  conversions.  The  performance  of  this  system  has  so  far  been  acceptable  with 
geographic  databases  up  to  about  30  megabytes  in  size.  Significant  tuning  can  be 
performed  using  Smalltalk's  own  capabilities;  however,  it  is  likely  that  some  portions  of 
the  system  such  as  data  conversions  and  graphical  rendering  could  be  better  implemented 
in  a  platform-dependent  manner  using  C  or  even  Assembler,  which  can  be  invoked  from 
Smalltalk.  ParcPlace's  VisualWorks  Smalltalk  code  is  platform-portable,  however,  which
is  a  great  advantage  for  both  development  and  maintenance.

One  limiting  factor  inherent  in  VisualWorks  has  been  a  practical  upper-bound  on 
the  size  of  hash  dictionaries,  which  are  (generally)  efficient  accessing  structures  provided 
as  part  of  the  Smalltalk  system  class  library.  These  dictionaries  lose  their  effectiveness 
when  attempting  to  store  more  than  about  16,000  elements.  This  limits  their  usefulness  for 
indexing  geo-feature  nonspatial  attributes,  as  there  can  be  many  more  than  16,000 
instances  of  a  given  feature  class  within  a  coverage.  One  possible  workaround  for  this
limitation  is  to  subdivide  large  collections  into  smaller  groupings,  so  that  no  single
grouping  has  more  than  16,000  elements.  Perhaps  a  later  version  of  Smalltalk  will 
overcome  this  issue. 
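
The subdivision workaround might be sketched as below. This is a Python illustration of the idea only; Python dictionaries do not share the VisualWorks limitation, and the class name `PartitionedIndex` is hypothetical. Groupings are filled one after another, which guarantees the size bound at the cost of searching each grouping in turn.

```python
class PartitionedIndex:
    """Dictionary-like index split into sub-dictionaries so that no
    single grouping exceeds `limit` elements."""

    def __init__(self, limit=16000):
        self.limit = limit
        self.parts = [{}]

    def put(self, key, value):
        # Update in place if the key already lives in some grouping.
        for part in self.parts:
            if key in part:
                part[key] = value
                return
        # Otherwise append, opening a new grouping when the last is full.
        if len(self.parts[-1]) >= self.limit:
            self.parts.append({})
        self.parts[-1][key] = value

    def get(self, key, default=None):
        for part in self.parts:
            if key in part:
                return part[key]
        return default
```

Lookup time grows with the number of groupings, so a real implementation would probably choose the grouping directly from the key (for example by feature class or key range) rather than searching them all.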

Future  Directions 

With  the  change  to  support  variable-length  attributes  described  above,  it  is 
conceivable  that  the  feature  attribute  structure  described  here  could  be  extended  to  store 
time-series  or  other  temporally-based  versions  of  nonspatial  data  for  a  given  attribute.  For 
example,  suppose  that  a  particular  road  or  bridge  were  seasonally  out  of  operation  due  to 
high  water.  The  attribute  describing  its  operational  status  could  be  stored  with  multiple 
values  associated  with  different  time  periods.  This  is  not  part  of  the  VPF  specification  at 
present,  but  could  be  accommodated  by  OVPF's  structure.  A  more  difficult  problem 
would  be  to  store  and  manage  multiple  editions  of  graphical  primitives  based  on  temporal 
data.  For  example,  to  store  a  set  of  shorelines  or  hydrographic  depths  representing 
different  tidal  levels  would  require  much  more  consideration  in  design  than  is  needed  for 
nonspatial  attributes. 
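
For the nonspatial case, a time-qualified attribute could be as simple as the following Python sketch. It is not part of OVPF or the VPF specification; the class `TemporalAttribute`, its half-open date intervals, and the rule that later entries override earlier ones are all assumptions made for illustration.

```python
from datetime import date

class TemporalAttribute:
    """One attribute holding several values, each valid for a period."""

    def __init__(self, default=None):
        self.default = default
        self.periods = []  # (start, end, value); end is exclusive

    def set_value(self, start, end, value):
        self.periods.append((start, end, value))

    def value_at(self, when):
        # Entries added later take precedence over earlier ones.
        for start, end, value in reversed(self.periods):
            if start <= when < end:
                return value
        return self.default

# Hypothetical bridge that is closed during spring high water.
status = TemporalAttribute(default="open")
status.set_value(date(1995, 3, 1), date(1995, 5, 1), "closed")
```

Here `status.value_at(date(1995, 4, 15))` yields "closed", while any date outside the stored period falls back to the default value "open".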

A  promising  application  for  the  rule-based  framework  described  here  could  be  in 
maintaining  spatial  topology  following  changes  to  geo-feature  locations.  One  possibility 
would  be  to  extend  or  mimic  the  FeatureConstructor  concept  to  support  the  use  of 
GraphicalPrimitiveConstructors.  These  objects  could  be  actuated  to  carry  out  the 
necessary  steps  for  splitting  and  splicing  Edge  and  Face  primitives,  creating  Nodes  at 
Edge  and  Face  intersections,  and  other  such  operations.  The  optimal  sequencing  of 
topology  maintenance  operations  can  become  complicated  with  dense  coverages.  By
integrating  a  rule-based  framework  with  GraphicalPrimitiveConstructors,  it  might  be 
possible  to  direct  the  sequence  of  topology-building  steps  based  on  runtime  conditions  and 
relative  priority  of  the  mathematical  graph-related  procedures.  Due  to  the  computational 
intensity  of  this  kind  of  processing,  it  would  be  essential  to  tune  such  a  framework  for 
maximum  performance. 

As  already-large  databases  get  much  larger,  it  will  be  increasingly  important  to 
fully  exploit  parallel  processing  and  data  channel  capabilities.  It  would  be  interesting  to 
explore  ways  of  using  the  rule-based  framework  to  help  determine  optimal  loading  of 
computational  resources  to  improve  performance  of  both  queries  and  data  updates. 

Summary 

The  research  and  development  described  in  this  thesis  represents  a  unique 
application  in  a  number  of  ways.  It  is  one  of  the  first  frameworks  for  GIS  designed  and 
implemented  with  Smalltalk,  and  shows  the  great  leverage,  versatility  and  expressiveness 
of  this  language  and  development  environment.  The  object-oriented  data  model  has 
demonstrated  itself  to  be  extensible  enough  to  accommodate  multiple  simultaneous 
database  schemata.  It  is  also  one  of  the  first  GIS  frameworks  that  uses  a  commercial 
ODBMS  to  store  all  the  spatial  and  nonspatial  data  in  a  consistent  and  extensible  manner. 
Finally,  it  demonstrates  a  comparatively  "low-overhead"  approach  to  integrating  a  rule- 
based  framework  and  active  database  capability  in  a  GIS.  While  much  further  work  is 
needed  to  carry  it  beyond  a  "proof  of  concept"  stage  to  be  useful  in  spatial  analysis,  this 
framework  offers  great  promise  for  such  efforts. 

It  is  my  belief  that  object-oriented  GIS  (OOGIS)  can  facilitate  collaborative
decision-making  on  difficult  issues  of  global  scope,  such  as  deforestation,  toxic  waste 
handling,  and  distribution  of  physical  and  financial  resources.  No  single  government  or 
multinational  corporation  can  possibly  have  full  understanding  of  all  the  consequences  of 
a  given  decision  and  course  of  action  with  respect  to  such  global  issues.  Nor  should  any 
one  government  or  other  organization  be  solely  responsible  for  implementation  of  policies 
and  activities  that  may  be  needed.  Thus,  collaboration  at  many  levels  across  national, 
organizational  and  class  boundaries  is  required.  Others  have  done  work  which  could  be 
useful  in  this  direction,  such  as  Nyerges  (1993)  and  Karnes  (1995),  and  significant  work 
has  been  reported  on  technologies  for  groupware  and  computer-supported  cooperative 
work  (Baecker  1993;  Furuta  and  Neuwirth  1994).  With  the  rapidly  accelerating 
acceptance  and  use  of  the  Internet,  combined  with  advanced  GIS  facilities,  it  should  be
possible  to  support  continuous,  real-time  communications  and  updates  among  GIS 
databases  for  use  by  participating  domain  experts  and  decision  makers  in  geographically 
dispersed  locations  around  the  world.  To  this  end,  the  OOGIS  and  knowledge-base 
capabilities  mentioned  in  this  thesis  seem  well  suited.  It  is  my  hope  that  the  work  started 
here  can  continue  in  this  direction. 